SOME STATISTICAL METHODS FOR COMPARISON OF
GROWTH CURVES 


C. RADHAKRISHNA RAO
Indian Statistical Institute, Calcutta, India 


INTRODUCTION 


In experiments involving study of growth the observations on a 
growing organism can sometimes be obtained continuously in time 
as a curve but generally, and more conveniently, at a finite number of 
specified time points. Two problems of interest which can be studied 
from such data are as follows. One is to construct a simple stochastic 
model characterising the growth of an individual organism during a 
certain period of time. Another is to compare the characteristics of 
growth under different conditions such as diet, environment, etc. If 
we succeed in obtaining a simple growth model, the second problem 
leads to the comparison of the models applicable to different situations. 
But this is not absolutely necessary for comparison of growth curves,
for we might compare various physically definable and meaningful
aspects of growth, such as total growth in a period, average growth
rate, and changes in growth rate, although they may not completely
explain the growth processes.

An early example of comparison of growth curves is due to Wishart 
[1938]. To each individual growth curve classified by litter, sex, and 
treatment, a second-degree polynomial in time was fitted by the least 
squares method. The coefficients of linear and quadratic terms were 
taken to represent salient features of growth and the fifteen or so 
observations on an individual growth curve were replaced by these 
two coefficients. The analysis then consisted in comparing the mean 
values of these coefficients under different experimental conditions. 
The analysis was justified because a large portion of the differences in 
growth curves was concentrated in the linear growth rate and to a lesser 
extent in the differential rate of growth measured by the coefficient of 
the quadratic term. 

Comparison of mean values at fifteen time points instead of these 
two aspects of growth would be a less efficient procedure although
valid multivariate tests exist for such comparisons. The success then
consists in replacing the various observations on growth by a few 
summary figures which lead to most efficient comparisons between 
groups. This is essential if comparisons have to be made on the basis 
of small samples and variations between individual growth curves 
are uncontrollably large. In fact, in such situations, effort should be 
made to reduce the data to the lowest possible number of dimensions 
without sacrificing the essential information. The purpose of this 
paper is to explore such possibilities and to develop the necessary tests 
of significance. 


1. A SIMPLE AND EXACT ANALYSIS FOR COMPARISON OF GROWTH
CURVES


1.1. Comparison of rates of growth 


As stated above, the main problem, especially in small samples, 
is to obtain an adequate representation of a growth curve with the 
minimum possible number of factors on the basis of which significant 
differences could be established between differently treated groups of 
individuals. The emphasis at this stage is not on obtaining a model 
adequately describing the growth of an individual but on examining 
whether differences exist between groups of growth curves. It should, 
however, be noted that, with small samples, it will be impossible to 
discriminate among a large number of widely varying models. 

Let us replace the observations on growth at different time points 
by the initial value and successive differences giving the gain in growth 
in different periods. Symbolically they may be represented by 


$$y_0,\; y_1,\; y_2,\; \ldots,\; y_p. \tag{1}$$


If the growth rate is uniform during the period under study, it is possible 
to replace the series (1) by the initial value and an estimate of the 
growth rate. With these two variables, comparisons between groups 
can be carried out. Rate is rarely uniform and in general growth is a 
complicated function of time. In controlled experiments, it may be a 
monotonic decreasing function of time during the period of growth. 
If, however, time can be transformed by a function $\tau = G(t)$ in such a
way that the growth rate is uniform with respect to the chosen time
metameter, then an adequate representation is available in terms of the
initial value and the redefined uniform rate.

Let us represent the length of the interval between the $(i - 1)$-th
and $i$-th time points on the transformed time axis by $g_i$. The increase
$y_i$ then corresponds to the time period $g_i$ so that an estimated rate of
growth with respect to $\tau$ is*


$$b = \frac{\sum y_i g_i}{\sum g_i^2}$$


and is proportional to $\sum y_i g_i$. The set of observations representing
an individual's growth can then be replaced by $y_0$ and $b$. If the problem
involves comparison of growth under different conditions, we need to
test whether the mean value of $b$ is the same in all groups by analysis
of variance with respect to the single variable $b$, eliminating the initial
value $y_0$ by analysis of covariance, if necessary. Thus there appears
to be no difficulty when the $g_i$ are known.

Fortunately the same test seems to be valid even if the $g_i$ are
estimated from the data themselves in a certain way and this is obviously
better than depending on a priori values of the $g_i$. The estimate of
$g_i$ is taken to be the grand mean** of $y_i$, the gain in the $i$-th interval,
for all individuals included in the sample. The analysis of variance
test is exact for such a choice of $g_i$ under the assumption of normality
of the distribution of $y_i$. The proof is immediate if we observe that
analysis of variance depends on the differences in the averages, while
estimates of $g_i$, being based on totals, are distributed independently
of these differences. We shall illustrate this method using the
observations on the growth of rats under three different conditions given in
a paper by Box [1950].

The totals of 27 observations for each week provide estimates of
$g_1$, $g_2$, $g_3$, and $g_4$:


$$g_1 = \sum y_1 = 603, \quad g_2 = \sum y_2 = 673, \quad g_3 = \sum y_3 = 570, \quad g_4 = \sum y_4 = 674.$$


From these, the value of $b = \sum y_i g_i$ is computed for each rat and given
in Table 1 (after dividing by 1000 arbitrarily to reduce the scale) along 
with the observed values of gains in weight in the successive weeks. 
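The computation just described can be sketched in a few lines of Python. This is only an illustrative check, not part of the original analysis: the gains are transcribed from Table 1, and the names `GAINS`, `time_metameter`, and `growth_index` are hypothetical helpers introduced here.

```python
# Weekly gains (y1, y2, y3, y4) transcribed from Table 1, by treatment group.
GAINS = {
    "control": [
        (29, 28, 25, 33), (33, 30, 23, 31), (25, 34, 33, 41), (18, 33, 29, 35),
        (25, 23, 17, 30), (24, 32, 29, 22), (20, 23, 16, 31), (28, 21, 18, 24),
        (18, 23, 22, 28), (25, 28, 29, 30),
    ],
    "thyroxin": [
        (26, 36, 35, 35), (17, 19, 20, 28), (19, 33, 43, 38), (26, 31, 32, 29),
        (15, 25, 23, 24), (21, 24, 19, 24), (18, 35, 33, 33),
    ],
    "thiouracil": [
        (25, 23, 11, 9), (21, 21, 10, 11), (26, 21, 6, 27), (29, 12, 11, 11),
        (24, 26, 22, 17), (24, 17, 8, 19), (22, 17, 8, 5), (11, 24, 21, 24),
        (15, 17, 12, 17), (19, 17, 15, 18),
    ],
}


def time_metameter(groups):
    """Estimate g_i as the total gain in week i over every rat in the sample."""
    rows = [rat for rats in groups.values() for rat in rats]
    return [sum(week) for week in zip(*rows)]


def growth_index(gains, g, scale=1000.0):
    """b = sum_i y_i * g_i, divided by an arbitrary scale factor."""
    return sum(y * gi for y, gi in zip(gains, g)) / scale


g = time_metameter(GAINS)
b = {name: [growth_index(rat, g) for rat in rats] for name, rats in GAINS.items()}
mean_b = {name: sum(values) / len(values) for name, values in b.items()}

print("g =", g)
for name, values in b.items():
    print(name, [round(v, 2) for v in values], "mean", round(mean_b[name], 3))
```

Because the $g_i$ are totals over all 27 rats, they need to be computed only once; each rat's index is then a single weighted sum of its four gains.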


The analysis of variance and covariance for $b$ and $y_0$ is given in
Table 2.


*This resembles the regression estimate. The average of $(y_i/g_i)$ is another estimate and so also
the simple average of $y_i$. The appropriateness of the formula employed depends on the assumptions
made on the variances of $y_i$. The method of testing developed here is valid for all these types of
estimates.

**The method of estimation was proposed by G. Rasch of Denmark during a course of lectures
on growth curves which he gave in India in 1951. This can be shown to be the least squares estimate
of the time metameter corresponding to the observed values under the assumption that the $y_i$ are
uncorrelated and have the same variance.
TABLE 1
INITIAL WEIGHT AND WEEKLY GAINS IN WEIGHTS OF RATS UNDER THREE
DIFFERENT TREATMENTS

y0 = initial weight
y1 = gain in 1st week
y2 = gain in 2nd week
y3 = gain in 3rd week
y4 = gain in 4th week
b  = (g1 y1 + g2 y2 + g3 y3 + g4 y4)/1000


Group 1, Control

No.   y0   y1   y2   y3   y4       b
  1   57   29   28   25   33   72.82
  2   60   33   30   23   31   74.09
  3   52   25   34   33   41   84.40
  4   49   18   33   29   35   73.18
  5   56   25   23   17   30   60.46
  6   46   24   32   29   22   67.37
  7   51   20   23   16   31   57.55
  8   63   28   21   18   24   57.45
  9   49   18   23   22   28   57.74
 10   57   25   28   29   30   70.67


Group 2, Thyroxin

No.   y0   y1   y2   y3   y4       b
  1   59   26   36   35   35   83.45
  2   54   17   19   20   28   53.31
  3   56   19   33   43   38   83.79
  4   59   26   31   32   29   74.33
  5   57   15   25   23   24   55.16
  6   52   21   24   19   24   55.82
  7   52   18   35   33   33   75.46


Group 3, Thiouracil

No.   y0   y1   y2   y3   y4       b
  1   61   25   23   11    9   42.89
  2   59   21   21   10   11   39.91
  3   53   26   21    6   27   51.43
  4   59   29   12   11   11   39.25
  5   51   24   26   22   17   55.97
  6   51   24   17    8   19   43.28
  7   56   22   17    8    5   32.64
  8   58   11   24   21   24   50.93
  9   46   15   17   12   17   38.78
 10   53   19   17   15   18   43.58

TABLE 2
ANALYSIS OF VARIANCE AND COVARIANCE FOR b AND y0

                                              S_bb
Source    d.f.     S_bb    S_by0   S_y0y0   corrected   d.f.   Mean sq.     F
                                             for y0

Between     2   3691.87    -0.39    10.19    3691.97      2    1845.99    18.5
Within     24   2297.01    35.98   517.81    2294.51     23      99.76
Total      26   5988.88    35.59   528.00    5986.48     25


The variance ratio is significant showing that growth rates are
different. An examination of the mean values of the regression
coefficients for the three groups,


$$\bar{b}_1 = 67.575, \quad \bar{b}_2 = 68.758, \quad \bar{b}_3 = 43.865,$$


shows that the differences are mainly due to the smaller rate for the
third group. The use of covariance analysis was not worthwhile because
of an extremely poor or no correlation between $b$ and $y_0$. It may also
be noted that the sum of squares between groups is practically
unchanged when corrected for $y_0$. This is probably due to some sort of
balancing with respect to average initial weight in assigning the rats 
to the three groups. 
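The entries of Table 2 can be checked with a short computation. The sketch below is illustrative only: it assumes the Table 1 values as transcribed, takes $g_1, \ldots, g_4$ as 603, 673, 570, 674, and uses the hypothetical helper names `index_b`, `sp`, and `adjusted`. The adjustment for $y_0$ is the usual one-covariate correction, $S_{bb} - S_{by_0}^2 / S_{y_0 y_0}$.

```python
# (y0, y1, y2, y3, y4) for each rat, transcribed from Table 1.
RATS = {
    "control": [
        (57, 29, 28, 25, 33), (60, 33, 30, 23, 31), (52, 25, 34, 33, 41),
        (49, 18, 33, 29, 35), (56, 25, 23, 17, 30), (46, 24, 32, 29, 22),
        (51, 20, 23, 16, 31), (63, 28, 21, 18, 24), (49, 18, 23, 22, 28),
        (57, 25, 28, 29, 30),
    ],
    "thyroxin": [
        (59, 26, 36, 35, 35), (54, 17, 19, 20, 28), (56, 19, 33, 43, 38),
        (59, 26, 31, 32, 29), (57, 15, 25, 23, 24), (52, 21, 24, 19, 24),
        (52, 18, 35, 33, 33),
    ],
    "thiouracil": [
        (61, 25, 23, 11, 9), (59, 21, 21, 10, 11), (53, 26, 21, 6, 27),
        (59, 29, 12, 11, 11), (51, 24, 26, 22, 17), (51, 24, 17, 8, 19),
        (56, 22, 17, 8, 5), (58, 11, 24, 21, 24), (46, 15, 17, 12, 17),
        (53, 19, 17, 15, 18),
    ],
}
G = (603, 673, 570, 674)  # weekly totals used as the estimated time metameter


def index_b(rat):
    """Growth index b = sum_i y_i * g_i / 1000 for one rat."""
    return sum(y * g for y, g in zip(rat[1:], G)) / 1000.0


def sp(xs, ys):
    """Corrected sum of products of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def adjusted(s_bb, s_by, s_yy):
    """Sum of squares of b after removing the linear regression on y0."""
    return s_bb - s_by ** 2 / s_yy


within = [0.0, 0.0, 0.0]  # S_bb, S_by0, S_y0y0 pooled within groups
all_b, all_y0 = [], []
for rats in RATS.values():
    b = [index_b(rat) for rat in rats]
    y0 = [rat[0] for rat in rats]
    within[0] += sp(b, b)
    within[1] += sp(b, y0)
    within[2] += sp(y0, y0)
    all_b += b
    all_y0 += y0
total = [sp(all_b, all_b), sp(all_b, all_y0), sp(all_y0, all_y0)]

n, k = len(all_b), len(RATS)
ss_within = adjusted(*within)                # n - k - 1 degrees of freedom
ss_between = adjusted(*total) - ss_within    # k - 1 degrees of freedom
F = (ss_between / (k - 1)) / (ss_within / (n - k - 1))
print(round(ss_between, 2), round(ss_within, 2), round(F, 1))
```

The between-groups line is obtained by subtraction after adjustment, which is why the total and within-groups sums of products are accumulated separately.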

The following comments about the test proposed above are worth 
noting: 

(i) It provides a valid test of the null hypothesis that the average 
growth curve is the same under all treatment conditions irrespective of 
any assumptions on the nature of the growth curve. If the average 
growth curve can be represented by a straight-line trend for each 
group by a suitable choice of a common time metameter, then the 
above test utilizes in some sense all the relevant information about the 
comparison of the average growth curves. 

(ii) For the application of this test, it is not necessary to know the 
exact values of the time points at which observations are made. It 
is, of course, necessary that at each time point, the observations should 
have been taken on all the individuals in the experiment. 

(iii) There are many experiments in which the responses are obtained 
under different conditions or for different groups subjected to a graded 
set of doses not quantitatively measurable. Under the assumption
that response is an increasing function of dose, the technique developed
above can be used.

(iv) One can also fit a quadratic in the estimated time metameter 
for each individual curve and examine group differences in the second 
degree term and proceed to higher degree terms if necessary. 

(v) Besides a transformation of the time variable it may be desirable 
to also transform the variable under study (for instance, to the log- 
arithm in the case of weight) to secure a closer straight-line trend. 


1.2. Tests of further aspects of the null hypothesis concerning equality of 
average growth curves 


The analysis of Sec. 1.1 would result in a comparison of the growth 
curves in all relevant aspects only if the average curves in different 
groups could be made linear by a common time metameter. The 
hypothesis of the existence of a common transformation can, however, 
be subjected to a test by the procedure outlined in Sec. 2 when the 
sample size is large. 

If there are k groups to be compared in each of p measurements 
representing growth in p successive given time periods, then we have 
p(k — 1) degrees of freedom out of which (k — 1) degrees of freedom 
have been used in comparing the average growth as illustrated in 
Sec. 1.1. To obtain a test criterion based on the rest of the degrees of 
freedom, we formally define a hypothesis 


$$\frac{\mu_{ri}}{g_i} - \frac{\mu_{rj}}{g_j} - \frac{\mu_{si}}{g_i} + \frac{\mu_{sj}}{g_j} = 0, \qquad r, s = 1, \ldots, k; \quad i, j = 1, \ldots, p, \tag{2}$$


where $\mu_{ri}$ is the mean of $y_i$ for the $r$-th group and the $g_i$ are the estimates
obtained in Sec. 1.1. The equations (2) can be recognised as a set of
linear hypotheses which can be tested by a suitable Wilks' criterion
whose computation is explained below.

We first obtain an analysis of dispersions of the variables $y_1$, $y_2$,
$y_3$, and $y_4$ as between and within groups. The sum of squares and
products (S.P.) matrix within groups having 24 degrees of freedom is
taken and to it is appended an extra column and row containing $g_1$, $g_2$,
$g_3$, $g_4$, and zero in the pivotal position as shown below:


$$\begin{bmatrix}
582.3 & 42.5 & -55.5 & -74.6 & g_1 \\
 & 609.0 & 626.5 & 344.5 & g_2 \\
 & & 1046.7 & 459.0 & g_3 \\
 & & & 853.0 & g_4 \\
 & & & & 0
\end{bmatrix}$$


In the above matrix the elements below the diagonal are omitted
because of symmetry. Substituting the values $g_1 = 603$, $g_2 = 673$,
$g_3 = 570$, $g_4 = 674$ obtained before, the value of the determinant $\Delta$ is
computed by any standard method. Thus


$$\Delta(\text{for error}) = -1291 \times 10^{11}.$$


The same procedure is repeated with the total S.P. matrix (between +
within groups) having (24 + 2) degrees of freedom.
The matrix of the determinant to be evaluated is indicated by


$$\begin{bmatrix}
664.0 & 79.7 & -44.0 & 38.3 & 603 \\
 & 1085.9 & 1409.2 & 1131.9 & 673 \\
 & & 2362.6 & 1719.1 & 570 \\
 & & & 2187.0 & 674 \\
 & & & & 0
\end{bmatrix}$$


$$\Delta(\text{for error + between}) = -3045 \times 10^{11}.$$

The criterion is


$$\Lambda = \frac{\Delta(\text{error})}{\Delta(\text{error + between})} = \frac{-1291}{-3045} = 0.424.$$


In the usual notation (see Rao [1952], p. 260), $n = 26$, $q = 2$, and
$p = 3$, since there are effectively 3 variables representing successive
differences of $(y_i/g_i)$. In this case an exact test is available in the
form of a variance ratio


$$F = \frac{1 - \sqrt{\Lambda}}{\sqrt{\Lambda}} \cdot \frac{n - p - 1}{p}$$


with $2p = 6$ and $2(n - p - 1) = 44$ degrees of freedom. The observed
value 3.93 exceeds the five percent value of F giving further evidence 
of differences between the growth curves under different treatments. 
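As a rough check on the arithmetic, the two bordered determinants and the resulting variance ratio can be recomputed as follows. This is a minimal sketch under stated assumptions: the matrix entries are those printed above, `det` and `bordered` are hypothetical helpers written for this illustration, and the variance ratio uses the standard exact form for $q = 2$, $F = \frac{1 - \sqrt{\Lambda}}{\sqrt{\Lambda}} \cdot \frac{n - p - 1}{p}$.

```python
from math import sqrt


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap changes the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def bordered(upper, g):
    """Complete a symmetric matrix from its upper triangle, then border it
    with the vector g and a zero in the corner."""
    p = len(g)
    full = [[0.0] * (p + 1) for _ in range(p + 1)]
    for i in range(p):
        for j in range(i, p):
            full[i][j] = full[j][i] = upper[i][j - i]
        full[i][p] = full[p][i] = float(g[i])
    return full


# Upper triangles of the S.P. matrices as printed in the text.
WITHIN = [[582.3, 42.5, -55.5, -74.6], [609.0, 626.5, 344.5],
          [1046.7, 459.0], [853.0]]
TOTAL = [[664.0, 79.7, -44.0, 38.3], [1085.9, 1409.2, 1131.9],
         [2362.6, 1719.1], [2187.0]]
G = [603, 673, 570, 674]

det_error = det(bordered(WITHIN, G))
det_total = det(bordered(TOTAL, G))
lam = det_error / det_total

n, q, p = 26, 2, 3
F = (1 - sqrt(lam)) / sqrt(lam) * (n - p - 1) / p
print(det_error, det_total, round(lam, 4), round(F, 2))
```

Bordering the matrix with the $g_i$ confines the comparison to linear functions $\sum a_i y_i$ with $\sum a_i g_i = 0$, which is why only $p = 3$ effective variables enter the variance ratio.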

The device used in defining the formal hypothesis (2) is just to 
obtain a statistic, which records deviations from the hypothesis specify- 
ing equality of growth curves and whose distribution is exact under that 
null hypothesis. The equations (2) cannot define a hypothesis unless
the $g_i$ have assigned values and are not estimates. With given values
of $g_i$, the test criterion, whose computation is explained above, is valid
for testing such a hypothesis. This hypothesis would imply that,
with respect to a given time metameter, the average growth curves
in all groups are linear or, in other words, there is no interaction between
growth rates (with respect to the chosen time scale) and successive
intervals of time.


In large samples the estimates $g_i$ tend to their expected values, in
terms of which the hypothesis under consideration can be strictly
defined. In such a case, when stable values of $g_i$ are obtained, the test
criterion derived using the estimates $g_i$ may be interpreted as
providing a test of goodness of fit of a linear trend with respect to a common
time metameter or of interaction referred to above. But in finite
samples, it overestimates significance as a test of this hypothesis.
However, when the test does not show significance, one can conclude 
that there is no evidence of departure from this hypothesis. 


2. ALTERNATIVE TESTS VALID FOR LARGE SAMPLES 


The tests developed in Sec. 1 are simple in the sense that they do 
not involve complex computational techniques. More efficient tests 
depending on the roots of determinantal equations can be constructed 
but their exact distributions are not known. Fairly good approxima- 
tions to the percentage points are available when the sample size is 
large. 

Let us consider the problem of goodness of fit of linear trend with 
respect to a common time metameter discussed in Sec. 1.2. The follow- 
ing notations are used: 


$k$ = number of groups,
$n_i$ = sample size for the $i$-th group,
$\bar{y}_{ij}$ = the observed average growth in the $j$-th period for the $i$-th
group,

$B_{rs} = \sum_i n_i \bar{y}_{ir} \bar{y}_{is}$, the uncorrected sum of squares and products
between groups,

$T_{rs}$ = the total uncorrected (for the mean) sum of squares and
products within groups, and

$S_{rs} = T_{rs} - B_{rs}$, the corrected sum of squares and products within
groups.


Construct the determinantal equation

$$|S - \lambda T| = 0. \tag{3}$$


The likelihood-ratio criterion for testing the goodness of fit of linear
trend for growth curves in different groups is


$$\Lambda = \text{product of the } (p - 1) \text{ largest roots of equation (3)} = \frac{|S|}{\lambda_1 |T|},$$


where $\lambda_1$ is the smallest root. In large samples the statistic


can be used as $\chi^2$ with $(p - 1)(k - 1)$ degrees of freedom. With the
multiplying coefficient suggested by Bartlett [1948], the $\chi^2$-approximation
is


which differs from (4) in the multiplying constant.
The estimate of the common direction $(u_1, \ldots, u_p)$ which provides
a transformation of the time variable when not invalidated by the
above test can be obtained from the latent vector corresponding to
the smallest root of equation (3). If $(h_1, h_2, \ldots, h_p)$ is the latent
vector, then


$$u_r = \sum_s T_{rs} h_s, \qquad r = 1, \ldots, p,$$


where the $T_{rs}$ are the elements of the matrix $T$. These are, perhaps,
more efficient estimates than those obtained in Sec. 1.1 by averaging
the observations on gain in weight for each time interval.

The likelihood-ratio criterion for testing the equality of regression
coefficients (rate of growth with respect to the common time
metameter) is found to be


                                                                (6)


where $(t)$ is the matrix of corrected total sums of squares and products,
while it may be recalled that $(T)$ refers to the uncorrected sums of
squares and products, and $\lambda_1$ is the smallest root of equation (3).
In this case we may use the variance-ratio approximation,


with $(k - 1)$ and $(\sum n_i - k)$ degrees of freedom. It may be observed
that this test may be used even if the $\chi^2$-test of (4, 5) is significant,
but it acquires a special significance when a common transformation
of the time axis is indicated by the $\Lambda$-test.


3. INVESTIGATION OF GROWTH MODELS 
3.1. Tests based on the dispersion matrix 


In Sec. 1 and 2, tests were developed to examine whether, by a 
common transformation, the average growth curves of different groups
can be made linear and also to test whether the slopes are the same for
all the groups. No assumption was, however, made about an individual
growth curve. The observations $y_1, \ldots, y_p$, representing growth
in the $p$ time periods, were allowed to follow an arbitrary $p$-variate
normal distribution. Modification and improvement may be desirable 
if something is known about the stochastic nature of growth. 

We may consider the model 


$$y_{t\alpha} = \lambda_\alpha g(t) + \epsilon_t \tag{8}$$


where $y_{t\alpha}$ is the increase in the $t$-th interval, $\lambda_\alpha$ is a parameter specific
to individual $\alpha$, $g(t)$ an unknown function of time only and $\epsilon_t$ is a random
error. The errors $\epsilon_t$ and $\epsilon_{t'}$ for any two time periods are taken to be
uncorrelated. It is believed that, apart from a deterministic linear
trend for growth with respect to some time metameter, there are
independent disturbances taking place in small intervals of time. Under
such conditions the growth in any given period can be represented by
the expression (8).

The model (8) implies that, by a common transformation $\tau = g(t)$,
all the individual growth curves can be made linear apart from random 
fluctuations. It may be observed that in the analysis of previous 
sections this model (without the random error) was used only for the 
true average growth curves of different groups. The particular stochas- 
tic nature of individual curves described by the model (8) was not 
used in tests of significance. 

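The analogy of equation (8) with that used in factor analysis suggests
the more general model


$$y_{t\alpha} = \lambda_\alpha^{(1)} g_1(t) + \lambda_\alpha^{(2)} g_2(t) + \cdots + \epsilon_t \tag{9}$$


where $\lambda^{(1)}, \lambda^{(2)}, \ldots$ correspond to different factors and $g_1, g_2, \ldots$ the
regression coefficients. If such a representation is true, we should be
able to replace the growth curve by its estimated factor values $\lambda^{(1)}$,
$\lambda^{(2)}, \ldots$ and choose the dominant ones for further analysis.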

There is, however, one difference between the model (9) and factor
analytic models which include a constant $c(t)$ representing the mean
of $y(t)$. For this reason, we cannot directly apply the tests used in
factor analysis. One could obtain the likelihood-ratio test appropriate
to test the model (9) but this appears to be complicated. As a first
step the tests developed in factor analysis to determine the number of
factors etc. (see Lawley [1914], Rao [1945]) can be used since the
dispersion matrix of $y_1, y_2, \ldots$ is the same for both the models. The
model (9) has, however, further restrictions on the mean values which
have to be tested.

If in this model we replace $\epsilon_t$ by $\epsilon$ independent of $t$, then we can use
Hotelling's principal component analysis to determine the number of
factors. In this case, the test proposed by Bartlett [1950] in the
investigation of factors is applicable (see Rao [1954]).


3.2. Estimation of factors and tests of significance for differences between 
groups 


As in Sec. 1, let us consider the problem of comparing growth curves
classified under different treatments. Let us assume that


$$y_{t\alpha} = \lambda_\alpha^{(1)} g_1(t) + \cdots + \lambda_\alpha^{(k)} g_k(t) + \epsilon_t$$


and estimate all $\lambda$ and $g$ by minimising the expression

$$\sum_\alpha \sum_t \left[ y_{t\alpha} - \lambda_\alpha^{(1)} g_1(t) - \cdots - \lambda_\alpha^{(k)} g_k(t) \right]^2$$


which leads to the principal component analysis of the uncorrected
sum of squares and products matrix. Let $T$ represent this matrix,
the typical element $T_{rs}$ of which is computed from the formula


$$T_{rs} = \sum_\alpha y_{r\alpha} y_{s\alpha}.$$


(The sum over $\alpha$ is a sum over individuals.)
Consider the determinantal equation

$$|T - \mu I| = 0$$


and find the first $k$ latent vectors corresponding to the first $k$ dominant
roots.


Root               Latent vector
$\mu_1$ (max)      $g_1(1), g_1(2), \ldots, g_1(p)$


The latent vectors provide the values of the functions $g_1(t), g_2(t), \ldots,
g_k(t)$ at the time points $1, 2, \ldots, p$. For any individual the $\lambda$-values
are obtained from the formulae


$$\lambda_\alpha^{(1)} = y_{1\alpha} g_1(1) + y_{2\alpha} g_1(2) + \cdots + y_{p\alpha} g_1(p)$$


$$\lambda_\alpha^{(2)} = y_{1\alpha} g_2(1) + y_{2\alpha} g_2(2) + \cdots + y_{p\alpha} g_2(p)$$


which are linear combinations of the observations with the coefficients
provided by the latent vectors. By this process the $p$ observations on
increase in weight are replaced by $k$ values which in some sense represent
the dominant aspects of the growth curve. The methods of
multivariate analysis for testing the differences between treatments, etc.
can now be used on the $k$ reduced variables though the tests may not
be exact. The analysis can be undertaken in the order provided by
the latent vectors, first testing differences in $\lambda^{(1)}$, and then differences
in $\lambda^{(2)}$ independently of $\lambda^{(1)}$, and so on, eliminating the initial weight
if necessary. 
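The reduction to factor values can be illustrated on the rat data of Table 1. The sketch below is an assumption-laden illustration, not the paper's own computation: it extracts only the first latent vector of the uncorrected sum of squares and products matrix, uses plain power iteration in place of a full latent-root routine, and the names `uncorrected_sp` and `dominant_vector` are hypothetical.

```python
from math import sqrt

# Weekly gains (y1, y2, y3, y4) of the 27 rats in Table 1, all groups pooled.
GAINS = [
    (29, 28, 25, 33), (33, 30, 23, 31), (25, 34, 33, 41), (18, 33, 29, 35),
    (25, 23, 17, 30), (24, 32, 29, 22), (20, 23, 16, 31), (28, 21, 18, 24),
    (18, 23, 22, 28), (25, 28, 29, 30),
    (26, 36, 35, 35), (17, 19, 20, 28), (19, 33, 43, 38), (26, 31, 32, 29),
    (15, 25, 23, 24), (21, 24, 19, 24), (18, 35, 33, 33),
    (25, 23, 11, 9), (21, 21, 10, 11), (26, 21, 6, 27), (29, 12, 11, 11),
    (24, 26, 22, 17), (24, 17, 8, 19), (22, 17, 8, 5), (11, 24, 21, 24),
    (15, 17, 12, 17), (19, 17, 15, 18),
]


def uncorrected_sp(rows):
    """T[r][s] = sum over individuals of y_r * y_s (no correction for means)."""
    p = len(rows[0])
    return [[float(sum(row[r] * row[s] for row in rows)) for s in range(p)]
            for r in range(p)]


def dominant_vector(T, iterations=500):
    """Power iteration: largest root of |T - mu I| = 0 and its unit latent vector."""
    p = len(T)
    v = [1.0 / sqrt(p)] * p
    mu = 0.0
    for _ in range(iterations):
        w = [sum(T[i][j] * v[j] for j in range(p)) for i in range(p)]
        mu = sqrt(sum(x * x for x in w))  # norm of T v, which tends to mu_1
        v = [x / mu for x in w]
    return mu, v


T = uncorrected_sp(GAINS)
mu1, g1 = dominant_vector(T)

# First factor value for each rat: a linear combination of its gains
# with coefficients given by the first latent vector.
lam1 = [sum(y * c for y, c in zip(row, g1)) for row in GAINS]
print(round(mu1, 1), [round(c, 3) for c in g1])
```

Further latent vectors could be obtained by deflating `T` and repeating the iteration; a library latent-root routine would normally be preferred in practice.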


4. ILLUSTRATIVE EXAMPLE OF GROWTH PROCESSES UNDER
UNCONTROLLED CONDITIONS


The methods discussed in this paper were originally developed with 
the intention of applying them on weights of babies obtained at periodic 
intervals during the first year of growth. The data were collected by 
Dr. M. N. Rao and Dr. B. Bhattacharyya of the All India Institute 
of Hygiene and Public Health with the intention of setting up the norms 
and variations for weights of babies during the first year of growth. 
Their results are published in two papers (Rao and Bhattacharyya 
[1952] and [1953]). 

The data collected by them are of great scientific interest since they
provide a realistic picture of growth under natural (uncontrollable) 
conditions and this is what the authors were aiming at. About 100 
babies, 50 boys and 50 girls, were selected and investigators visited 
their houses in Calcutta periodically to obtain their weights. An 
inevitable feature of such an investigation is incomplete records. 
Observations could not be continued on all children even for such a 
short period as one year mainly due to death, disease of the children, 
and, to some extent, the transfer of parents to outside stations. On 
searching the records it was discovered that only in 14 cases for boys 
and 13 for girls could the observations be continued till the end of 
one year. (Nearly 75 percent casualties due to death, disease, and 
other causes!) This is an important factor which future investigators 
may bear in mind. 

The number of cases with complete records is not sufficiently large 
in this study to carry out any analysis of practical value but it would 
indeed be worthwhile examining how these 27 children have struggled 
during their first year of existence and survived to provide us with 
complete records. The reason for confining the analysis of the records 
to only 27 babies, omitting the incomplete records, is as follows: the
characteristics of growth in the case of children with incomplete records
due to death, disease, etc. will necessarily be different from those of
the rest, so much so that the information from the incomplete records
could not be pooled with the rest either for the purpose of determining 
the form of the average growth curve or the stochastic nature of the 
individual growth curves over the entire time period. 

Table 3 gives the mean values of weights (absolute and logarithmic) 
for boys and girls at intervals of twenty days from birth, the time 
metameter with respect to which the growth is expected to have a 
linear trend (which is estimated by the method of Rasch by averaging 
the weights of all the 27 children), and also values for gains in weight 
during the successive intervals. 


CHART 1. RELATIONSHIP BETWEEN AVERAGE LOGARITHMIC WEIGHT AND THE
CHOSEN TIME METAMETER
(Vertical axis: logarithmic weight. Horizontal axis: time metameter.
The figures within brackets correspond to the actual number of days.)


An examination of the figures in Table 3 and the chart based on 
them leads to the following conclusions. 
(i) The rate of growth in absolute weight or relative to absolute
weight appears to be generally greater for boys than for girls during
the first year of growth.

(ii) The rate of growth steadily decreases as time increases up to 
a certain time (220 days in this case uniformly for boys and girls) and 
then increases for a short while. This seems to demand an explanation.* 

In the interpretation of the rate of growth in the first 20 days one 
must take into account the fact that the weight of a baby decreases 
within the first few days after birth and then increases.

(iii) With respect to the estimated time metameter the mean growth 
curves (Chart 1) for boys and girls appear to be faithfully linear. 
Although logarithmic weight is chosen for presentation in the chart 
the same is true of the absolute weights with respect to its own time 
metameter. Surprisingly, the individual growth curves also exhibit 
linear trend. They are not reproduced here for want of space. At 
least in the first year of growth it appears that growth is largely con- 
trolled by a single factor. 


5. THE PROBLEM OF THE DISCRIMINANT FUNCTION WITH 
CONTINUOUS CURVES 


In particular cases, if devices exist to record growth continuously 
with time, we have the problem of comparing the averages of entire 
curves and not merely at a number of time points. It may be necessary 
to consider the derivative curves for comparison. Our attempt here is 
only to develop the most general solution in the problem of discrimi- 
nation when observations consist of continuous curves. 

Let $f_\alpha(t)$ represent the observed curve for an individual $\alpha$ in the
time interval, say $(0, 1)$. The expectation curve for the group is


$$E\{f_\alpha(t)\} = f(t)$$

and the dispersion function is

$$E\{[f_\alpha(t) - f(t)][f_\alpha(t') - f(t')]\} = D(t, t').$$


The problem is one of determining a linear functional $L\{f(t)\}$ with
respect to which two given groups differ the most. For continuous
functions $f(t)$ it is known that the following integral representation of
a linear functional holds


$$L\{f(t)\} = \int_0^1 f(t)\, dg(t).$$


*In India, generally, milk is the only diet given to a child for about six months. This is
supplemented by rice or some other form of starch between the 7-th and 9-th months depending on the family
custom.

The problem then reduces to the determination of $g(t)$.

As in the $p$-variate problem we will maximise the ratio of the square
of the difference in mean values for two groups to the common variance.
If $\alpha$ and $\beta$ denote individuals in the two groups,


$$E[L\{f_\alpha(t)\} - L\{f_\beta(t)\}] = \int [E\{f_\alpha(t)\} - E\{f_\beta(t)\}]\, dg(t) = \int d(t)\, dg(t),$$


where $d(t)$ is the difference between average curves. The variance of
$L\{f_\alpha(t)\}$ is

$$\iint D(t, t')\, dg(t)\, dg(t').$$


The ratio to be maximised is


$$\iint d(t)\, d(t')\, dg(t)\, dg(t') \Big/ \iint D(t, t')\, dg(t)\, dg(t').$$


The function $g(t)$ for which the above expression is a maximum is
obtained as a solution of the integral equation


$$d(t) = \int D(t, t')\, dg(t').$$


There is no general method of solving this equation for any given 
dispersion function D(t, t’) except numerically by reducing it first to 
a problem of curves approximated by straight lines whose number is 
increased till $g(t)$ is determined with the requisite accuracy. This is
equivalent to comparing the curves at a finite number of points each 
time. If special forms are assumed for the dispersion function, then 
direct solutions may exist. Further work done in this direction will be 
reported elsewhere. 
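The numerical reduction mentioned above can be sketched by replacing the integral equation with a finite linear system on a grid of time points. Everything in the example below is hypothetical and chosen only so that the answer can be checked by hand: the dispersion function $D(t, t') = \min(t, t')$, the mean difference $d(t) = t$, the five-point grid, and the helper names `solve` and `discriminant_weights`. The unknowns are the increments of $g$ attached to each grid point.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def discriminant_weights(d, D, grid):
    """Discrete analogue of d(t) = integral of D(t, t') dg(t'):
    solve sum_j D(t_i, t_j) w_j = d(t_i) for the increments w_j of g."""
    matrix = [[D(s, t) for t in grid] for s in grid]
    return solve(matrix, [d(s) for s in grid])


# Hypothetical inputs: linear mean difference, min(t, t') dispersion.
grid = [j / 5 for j in range(1, 6)]
w = discriminant_weights(lambda t: t, lambda s, t: min(s, t), grid)
print([round(x, 6) for x in w])
```

With these particular assumed inputs all the weight falls on the last grid point, so the discretised discriminant reduces to the final observed value; refining the grid corresponds to increasing the number of comparison points as described in the text.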
HOW TO USE RIDIT ANALYSIS


Irwin D. J. Bross 


Department of Public Health and Preventive Medicine, Cornell 
University Medical College, New York, U.S.A. 


1. Introduction 


In many scientific studies in the biological and behavioral sciences
(probably in a majority of such studies) the scientist has to work with
a response variable which falls in the “borderland” between dichotomous
classifications (e.g. “lived”-“died”) and refined measurement
systems (i.e. measurements which are highly reproducible at
different times or at different places). Sometimes the response variable
is a subjective scale (i.e. a well ordered series of categories such as
“minor,” “moderate,” “severe”). At other times the response variable
takes numerical values but the measurement system is heavily dependent
on the quality of experimental material, details of protocol, or the
technical skill of the scientist. These “borderland” response variables
may not be adequately analysed by the chi-square family of statistical 
methods and at the same time the t-test family of techniques may not 
be appropriate. In this situation ridit analysis may serve as a “missing 
link” between the two traditional families of statistical methods. 

This paper is addressed to scientists who are working with
“borderland” response variables. It will contain no mathematical derivations
(these will appear in a subsequent paper [1]). Its purpose is to explain 
and illustrate how to use ridit analysis in a scientific study. For this 
purpose a ridit analysis of data from the Cornell Automotive Crash 
Injury Research Program (ACIR) will be presented. This material 
will serve to illustrate the various problems that come up in actual 
studies, problems ranging from difficulties due to peculiarities and 
imperfections of the basic data to questions of presentation and inter- 
pretation of the results. The ACIR analysis will also show the role of 
ridit analysis in achieving the objectives of a scientific study. In this 
particular case the analysis throws light on a major public health 
problem—the carnage due to auto accidents. 

Nowadays the catalogue of statistical methods is so very extensive 
that a working scientist is somewhat less than overjoyed at the prospect
of having to learn yet another procedure. He would like the answers
to certain obvious questions before he takes the time and trouble to 
learn about the new method. So here are some immediate answers: 

Is ridit analysis a practical technique or is it just another math- 
ematical toy? Is it so specialized that a researcher is not likely to 
have occasion to use it? Over the past several years I have used ridit 
analysis in well over a dozen different medical research studies. These 
studies have ranged from epidemiological and clinical investigations to 
laboratory animal and microbiological experiments. In one or two 
studies ridit analysis turned out to be impractical and in two other 
studies it worked no better than conventional methods. In the remain- 
ing studies ridit analysis reduced a complex and somewhat bewildering 
mass of data to a form where the scientist could see just what was 
going on and could answer the questions he set out to answer. In most 
of the favorable instances conventional methods could not be used or 
were ineffective. The method seems to be a practical general-purpose 
tool. 

Are the mechanics of ridit analysis hard to learn? Does it lead to 
extensive computation? The mechanics of ridit analysis can be learned 
in ten minutes (see Table 1). Apart from this one operation the com- 
putations are simply the usual means and variances of the t-test family 
of statistical methods. For quick, preliminary scrutiny of data there is 
a shortcut version of ridit analysis which eliminates sum-of-squares 
operations. 

Can ridit analysis be safely used on “borderland” data? The 
procedure is as safe as other statistical methods and may sometimes be 
safer because it is “distribution free” in one sense. However, no statistical 
method can guarantee complete safety with this type of data. 


2. Operational Definition of Ridits 


The name “ridits” was chosen because of the analogy with “probits” 
and “logits”. Like other members of the “it” family, ridits represent 
a type of transformation. But whereas probits are relative to a theoretical 
distribution (the normal distribution), ridits are relative to an empirical 
distribution. The first three letters stand for Relative to an Identified 
Distribution. In other words, ridits are based on the observed distribution 
of a response variable for a specified set of individuals. Ridits 
represent a new application of a very old idea (“the probability 
transformation”) and are closely related to distribution-free methods based 
on ranks (especially the Wilcoxon Test). The technique grew out of 
efforts to apply the rank t-test [2] to ACIR data (where the number of 
subseries was large). 

The first step in the use of ridits is the choice of the identified 
distribution, an important choice (and not always an easy one). Once 
this choice is made the calculation of ridits is a simple routine process, 
the mechanics of which are shown in Table 1. Column 1 of Table 1 
gives the distribution (with respect to a subjective injury scale) of the 
individuals in the “identified distribution” of the ACIR study. Thus 
of 179 persons, 17 were reported as not injured and 14 were fatally 
injured. The ridits are calculated from the numbers in column 1 
according to the instructions listed below the table. 

The ridit for a given category is simply the proportion of individuals 
injured to a lesser degree plus one half the proportion of individuals in 
the category itself. 


TABLE 1 
CALCULATION OF RIDITS (COMPUTING FORM) 

Degree of injury     (1)     (2)     (3)      (4)      (5) 
None                  17     8.5       0      8.5    0.047 
Minor                 54    27.0      17     44.0    0.246 
Moderate              60    30.0      71    101.0    0.564 
Severe                19     9.5     131    140.5    0.785 
Serious                9     4.5     150    154.5    0.863 
Critical               6     3.0     159    162.0    0.905 
Fatal                 14     7.0     165    172.0    0.961 
Total                179             179 


Instructions: 
Column (1): The frequency distribution in the identified distribution (reference class) 
Column (2): One-half of the corresponding entry in Column (1) 
Column (3): The cumulate of Column (1) (displaced one category downward) 
Column (4): Column (2) + Column (3) 
Column (5): The entries in Column (4) divided by grand total (179). The numbers are the 
ridits. 
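
The computing form of Table 1 can be sketched in a few lines of Python. This is a minimal illustration and not part of the original report; the function name `ridits_from_counts` is introduced here only for the example, and the counts are those of Column (1).

```python
from itertools import accumulate


def ridits_from_counts(counts):
    """Return one ridit per ordered category of the identified distribution.

    Mirrors the computing form of Table 1: half of the category's own
    count plus the count of all lower categories, divided by the total.
    """
    total = sum(counts)
    # Column (3): cumulative counts displaced one category downward.
    below = [0] + list(accumulate(counts))[:-1]
    # Column (4) = Column (2) + Column (3); Column (5) = Column (4) / total.
    return [(0.5 * c + b) / total for c, b in zip(counts, below)]


# Column (1) of Table 1, ordered from "None" through "Fatal".
reference_counts = [17, 54, 60, 19, 9, 6, 14]
ridits = ridits_from_counts(reference_counts)
print([round(r, 3) for r in ridits])
```

Rounded to three places, the values produced agree with Column (5) of Table 1.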


The operations in Table 1 can be viewed as a method of assigning 
a number (or weight) to the graded categories of the injury scale. In 
other words a person whose degree of injury was previously described 
by a name (e.g. “severe”) now has his degree of injury described by a 
number (e.g. 0.785). This same set of ridits or weights was used to 
describe the degree of injury for all of the 2253 persons in this study 
(not just those in the identified distribution). 

The purpose of the study was to examine the role of three factors 
(ejection from the vehicle, seated position, and accident severity, i.e. 
damage to the car) in the production of injury. Cross-tabulation by these 
factors led to 45 categories of occupants. Since the degree of injury has 
been converted to a number or ridit, it is possible to calculate 
average ridits in a category and in general proceed along the lines of 
the t-test family of statistical methods. Whether it is meaningful or 
valid to do so is a question which will be considered in subsequent 
sections. 


3. The Choice of the Identified Distribution 


For the ACIR data a specified set of individuals—less than 10% 
of the sample—was used to calculate the ridits. Subsequently other 
sets of individuals were compared with this reference set so that in 
effect the identified distribution provided the baseline for the study. 
The choice of the identified distribution is not automatic in ridit analysis 
and this is one of the things that distinguishes ridits from related 
techniques. Indeed in ridit analysis the choice is important and some- 
times it is crucial. 

The key to an intelligent choice of the baseline is a clear picture 
of our goal. If we are dealing with a continuing research program we 
would like numbers which will be repeatable from study to study. If 
there are other research programs studying the same phenomena, we 
would like numbers which will be comparable from one program to the 
next. In short our goal is to achieve the space-time stability of refined 
measurement systems even though we are starting from a “borderland” 
response variable. 

Sometimes there is a natural choice of a baseline which will lead 
toward this goal. In a continuing analgesic testing program new drugs 
are being tested against a standard (morphine) with respect to a pain- 
relief scale. The extensive information that has accumulated on 
patient response to morphine provides a natural baseline. In another 
clinical trial a placebo, standard, and test laxative were given to “well 
normal” individuals (medical students) and various patient groups. 
The natural reference set was the responses of well normals to placebos. 
The medical students are a more homogeneous group than a patient 
series. Furthermore investigators at other institutions are likely to 
have access to a similar group since medical students are favorite 
“guinea pigs” in clinical research. 

Occasionally the study series as a whole will serve as a reference 
set because it is representative of some larger population. Thus in a 
mental health study a random sample was drawn from an urban population 
and the ridits were based on all the observations. However, it 
is not wise to automatically use the totality of observations since (as 
would be the case in the above mentioned laxative study) the reference 
set may not be homogeneous and may not be representative of anything. 
Of course it is possible to go through the motions for any specified set 
but the resulting numbers are meaningless in terms of the context of 
the study. 

Now and then the choice of reference set depends on several factors 
and some compromise is necessary. The ACIR identified distribution 
is a case in point. The ACIR program is a continuing one so it is 
desirable to have numbers which are comparable over time. The refer- 
ence set should be large enough to insure that the ridits will be stable. 
Since the study is adding new states to the sample, the composition of 
the sample varies somewhat (for example accident severity tends to 
depend on traffic conditions). Hence we would like to choose the 
reference class to minimize various artifactual effects. It is also desirable 
to have a “centrally located” series—one which will span the full range 
of the response variable. In an attempt to strike a favorable balance 
between these and several other factors, the following specifications 
were set for the ACIR identified distribution. The set consists of those 
occupants who (a) were not ejected from the car, (b) were seated in 
the front seat of the car, (c) were in a car with at least one other occupant, 
(d) were in a car which sustained “severe” damage (accident severity). 
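
As an illustration only, the four specifications can be written as a single membership test. The record layout below (the class `Occupant`, its field names, and the category labels) is hypothetical, since the ACIR record format is not described here.

```python
from dataclasses import dataclass


@dataclass
class Occupant:
    # Every field name and label is a hypothetical stand-in for the
    # corresponding item on an accident report.
    ejected: bool
    seat: str               # "driver", "right_front", "center_front", "rear"
    occupants_in_car: int
    accident_severity: str  # "minor", "moderate", ..., "severe", "extreme"


FRONT_SEATS = {"driver", "right_front", "center_front"}


def in_identified_distribution(o):
    """True when an occupant meets specifications (a) through (d)."""
    return (
        not o.ejected                         # (a) not ejected
        and o.seat in FRONT_SEATS             # (b) front seat
        and o.occupants_in_car >= 2           # (c) at least one other occupant
        and o.accident_severity == "severe"   # (d) "severe" damage
    )


sample = [
    Occupant(False, "driver", 2, "severe"),
    Occupant(False, "driver", 1, "severe"),   # driver alone: excluded
    Occupant(True, "right_front", 3, "severe"),
    Occupant(False, "rear", 4, "severe"),
]
reference_set = [o for o in sample if in_identified_distribution(o)]
print(len(reference_set))
```

Only the first of the four sample records satisfies all the conditions.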


4. Average Ridits and Their Confidence Intervals 


Once the reference distribution (identified distribution) has been 
chosen and the corresponding ridits are computed, there is associated 
with each occupant a numerical quantity (ridit) which can serve as a 
measure of degree of injury. We therefore are able to carry out the 
familiar statistical operations such as calculating the average for a set 
of occupants, the variance in a given set, confidence intervals for the 
average, and more complicated numerical operations. Although we 
can perform such computations merely by using the formulas in text- 
books of elementary statistics [3], we have to remember that the numbers 
we are using are not measurements like those of the pocket ruler. If 
we are using ridits solely for qualitative conclusions (such as would be 
obtained from tests of significance), we may not need to worry a great 
deal about the peculiarities of subjective scales. On the other hand, 
if we wish to derive quantitative results, we must be able to interpret the 
arithmetic mean in some way, that is, we have to regard the mean as 
an estimate of “something” and then identify that “something.” 

The average ridit has a probability interpretation—it is an estimate 
of the chance that an individual in a given class is “worse off” than 
an individual in the reference class. For example, the average ridit 
for drivers with passengers (who are not ejected) in “moderately 
severe” accidents is 0.27. This means that the probability that an 
individual in the above class will sustain a worse injury than an 
individual in the reference class is estimated as 0.27. It may be easier 
to think in terms of odds rather than in terms of probabilities. An 
alternative statement would be that the chances are about 3 to 1 that 
such drivers will sustain a lesser injury than individuals in the reference 
set. Hence, the average ridit not only tells us that individuals are 
better off than individuals in the reference class, but it also indicates 
how much better off they are (and in a way that is fairly easy to under- 
stand). 
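
The conversion from an average ridit to odds is simple arithmetic, sketched below. The helper name `odds_better_off` is introduced here for illustration and is not taken from the original report.

```python
def odds_better_off(mean_ridit):
    """Odds (to 1) that a class member fares better than a reference individual.

    The average ridit estimates the chance of being worse off, so its
    complement divided by the ridit gives the odds of being better off.
    """
    if not 0.0 < mean_ridit < 1.0:
        raise ValueError("an average ridit must lie strictly between 0 and 1")
    return (1.0 - mean_ridit) / mean_ridit


# Average ridit quoted in the text for non-ejected drivers with
# passengers in "moderately severe" accidents.
print(round(odds_better_off(0.27), 1))  # 2.7, i.e. "about 3 to 1"
```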

This interpretation can be formalized in the following way. We 
can imagine that an individual is selected at random from a given class 
and that a second individual is selected at random from the reference 
class. We then can compare the two individuals selected and see which 
individual falls into the higher degree-of-injury category. This 
operation would be repeated, and the relative frequency of cases where 
the individual in the given class is worse off than the individual in the 
reference class is the probability with which we are dealing. It is 
important to distinguish this probability from the probabilities 
associated with significance tests or confidence intervals. The confidence 
interval on the average ridit involves the probability of a probability 
statement being true, hence a “higher level” probability. (For a 
discussion of the rather involved questions concerning the hierarchy 
of probabilities, see The Theory of Probability by Hans Reichenbach [4].) 

TABLE 2 
AVERAGE RIDITS 

                                     Accident Severity 
Seated Position       Minor      Mod.     Mod. Sev.    Severe    Ext. Severe 

Non-Ejected Occupants 
Driver Alone          (.36)      .40        .46         .63        (.81) 
Driver W/Pass.                   .18        .27         .47        (.66) 
RF and CF              .22       .29        .38         .52        (.64) 
Rear Seat              .19       .19        .23         .34 
Unusual                .24      (.46)      (.15)       (.63)       (.22) 

Ejected Occupants 
Driver Alone          (.50)     (.51)       .59        (.74)       (997) 
Driver W/Pass.        (.12)     (.63)       .56        (.55)       (.72) 
RF & CF               (.27)     (.47)       .54         .61        (.50) 
Rear Seat             (.48)     (.25)      (.38)       (.40)       (.66) 

Note: Those categories containing less than 30 cases are in parentheses. 
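
The random-pairing interpretation can be checked numerically. The sketch below is an illustration rather than the author's procedure: it enumerates every possible pair (one individual from the class, one from the reference class) and compares the resulting relative frequency with the average ridit. Ties are counted as one half, which is what the definition of a ridit implies. The class counts (28 not injured, 8 minor, 1 moderate) are those the paper reports for non-ejected drivers with passengers in minor accidents; all function names are introduced here.

```python
from itertools import accumulate


def ridits_from_counts(counts):
    """Ridits for the ordered categories of the reference class."""
    total = sum(counts)
    below = [0] + list(accumulate(counts))[:-1]
    return [(0.5 * c + b) / total for c, b in zip(counts, below)]


def mean_ridit(class_counts, ridits):
    """Average ridit of a class given its counts per category."""
    n = sum(class_counts)
    return sum(c * r for c, r in zip(class_counts, ridits)) / n


def pairwise_frequency(class_counts, ref_counts):
    """Share of (class, reference) pairs in which the class member is
    worse off, with tied pairs counted as one half."""
    score = 0.0
    for i, ci in enumerate(class_counts):
        for j, rj in enumerate(ref_counts):
            if i > j:
                score += ci * rj
            elif i == j:
                score += 0.5 * ci * rj
    return score / (sum(class_counts) * sum(ref_counts))


reference = [17, 54, 60, 19, 9, 6, 14]   # Column (1) of Table 1
minor_drivers = [28, 8, 1, 0, 0, 0, 0]   # counts quoted in the text

by_ridits = mean_ridit(minor_drivers, ridits_from_counts(reference))
by_pairs = pairwise_frequency(minor_drivers, reference)
print(round(by_ridits, 3), round(by_pairs, 3))
```

The two numbers coincide, which is the sense in which the average ridit is an estimate of a probability.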

It will be observed that the non-ejected “drivers with passenger” 
and “right front and center front seat” occupants in severe accidents 
have average ridits of 0.47 and 0.52 respectively. The combination of 
these two categories constitutes the reference set for the ACIR study 
and has a ridit of 0.50. The average ridit for the identified distribution 
will always be 0.50, apart from rounding errors, by the definition of a 
ridit. In other words, an individual picked at random from the reference 
class has the same chance of being worse off (or better off) as another 
individual selected at random from the reference class. It will be seen 
from Table 2 that the average ridits in classes other than the reference 
class are sometimes much smaller and sometimes considerably larger 
than 0.50 so that the individuals in the different classes have very 
different risks of injury. We will consider these relative risks in more 
detail in a subsequent section. 

For quick and easy exploration of data we can take advantage of 
the fact that the variance in a given category is rarely much greater 
than that of the rectangular distribution (i.e. 1/12). We can put rough 
95% confidence intervals on ridit means by adding (subtracting) $1/\sqrt{3N}$ 
(where N is the number of observations going into the mean). This 
eliminates the sum of squares operations but at the same time (as will 
be explained below) may lose information of some importance. 
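
The shortcut follows from taking two standard errors with the variance set to 1/12: 2 * sqrt((1/12)/N) = 1/sqrt(3N). A minimal sketch is given below; the function name is introduced here, and N = 210 is the count listed for non-ejected drivers with passengers in moderately severe accidents.

```python
import math


def shortcut_interval(mean_ridit, n):
    """Rough 95% interval using the rectangular variance 1/12.

    Two standard errors are 2 * sqrt((1/12) / n), which simplifies
    to 1 / sqrt(3 * n); no sum of squares is needed.
    """
    half_width = 1.0 / math.sqrt(3.0 * n)
    return mean_ridit - half_width, mean_ridit + half_width


low, high = shortcut_interval(0.27, 210)
print(round(low, 2), round(high, 2))
```

Because the shortcut assumes the rectangular variance, it tends to give a wider interval than the full calculation whenever the class variance is below 1/12.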

The variances in a class can also be calculated in the usual fashion 
[with the denominator (n — 1)]. In the reference class the value 0.078 
is obtained. In general, for the reference distribution the value will 
be close to 1/12 or 0.083 unless most of the cases are concentrated in 
one or two of the degrees of injury categories. For classes other than the 
reference class, the variances may be substantially smaller or a little 
larger than 0.083. However, as can be seen from Table 3, none of the 
variances is more than one and a half times as great as 0.083. There 
are, however, several variances which are less than one-half of the 
theoretical variance 0.083. In general, it can be said that the ridit 
transformation acts to stabilize the variances (i.e., renders them more 
nearly uniform). However, such stabilization as does occur is a 
by-product, rather than the primary purpose, of the transformation and, 
indeed, a complete stabilization is not achieved. 
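
The variance calculation "in the usual fashion" can be sketched as follows, working from the category counts of a class. The helper names are introduced here for illustration. Applied to the reference class of Table 1, the sketch reproduces the mean of 0.50 and the variance of 0.078 quoted in the text.

```python
from itertools import accumulate


def ridits_from_counts(counts):
    """Ridits for the ordered categories of the reference class."""
    total = sum(counts)
    below = [0] + list(accumulate(counts))[:-1]
    return [(0.5 * c + b) / total for c, b in zip(counts, below)]


def ridit_mean_and_variance(class_counts, ridits):
    """Mean and variance (denominator n - 1) of the ridits in a class."""
    n = sum(class_counts)
    mean = sum(c * r for c, r in zip(class_counts, ridits)) / n
    ss = sum(c * (r - mean) ** 2 for c, r in zip(class_counts, ridits))
    return mean, ss / (n - 1)


reference = [17, 54, 60, 19, 9, 6, 14]   # Column (1) of Table 1
ref_mean, ref_var = ridit_mean_and_variance(
    reference, ridits_from_counts(reference)
)
print(round(ref_mean, 2), round(ref_var, 3))  # 0.5 0.078
```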

TABLE 3 
VARIANCES 

                                     Accident Severity 
Seated Position       Minor      Mod.     Mod. Sev.    Severe    Ext. Severe 

Non-Ejected 
Driver Alone         (.0299)    .0348      .0438       .0657      (.0372) 
Driver W/Pass.        .0122     .0311      .0522       .0773      (.1234) 
RF & CF               .0456     .0414      .0568       .0801      (.1098) 
Rear Seat             .0353     .0317      .0509       .0761        --- 

Ejected 
Driver Alone         (.0890)   (.0872)     .0872      (.0840)     (.0072) 
Driver W/Pass.       (.0080)   (.0751)     .0997      (.0989)     (.1115) 
RF & CF              (.0330)   (.0501)     .0827       .1129      (.0810) 
Rear Seat            (.0956)     (0)      (.0910)     (.1125)     (.1239) 

Number of Occupants 

                                     Accident Severity 
                      Minor      Mod.     Mod. Sev.    Severe    Ext. Severe 
Seated Position      E   NE     E   NE     E   NE     E   NE      E   NE 

Driver Alone         6   19    10   81    30  102    24   46      6    7 
Driver W/Pass.       3   37    12  179    33  210    26   76      7   11 
RF & CF              5   43    27  205    51  241    38  103      9   13 
Rear Seat            3   37     1  121    12          8   58      6    0 

E = ejected; NE = non-ejected. 
Note: Those categories containing less than 30 cases are in parentheses. 

In interpreting the variances, we must remember that we are not 
dealing with a genuine measurement situation and, in particular, we 
are not dealing with normal distributions. In many measurement 
situations, partition of a given body of data into classes will have very 
little effect on the within class variability (this fact being exploited in 
analysis of variance by “pooling” of the within-class variances). For 
example, suppose we were to measure the yield (of some compound) 
in a chemical process where the temperature and pressure can be con- 
trolled. At the different temperature and pressure conditions used, 
the average yield might be considerably different, but nevertheless, 
repeated measurements of the yield at a specified temperature-pressure 
condition might be nearly normally distributed and the variances (or 
scatter) of the repeated observations might be about the same even 
though the particular temperature and pressure conditions were quite 
different. 

In the ACIR data a very different situation occurs. In the reference 
set there is a central peak so that the distribution somewhat resembles 
the normal distribution (apart from some excess in the “fatal” category). 
However, for non-ejected drivers with passengers in “minor” accidents 
there is a very different sort of distribution—one which looks somewhat 
like an exponential decay curve. That is, there are 28 occupants with 
no injury, 8 occupants with “minor” injuries, and one occupant with a 
“moderate” injury. It is therefore easy to see why the variance in 
this particular class is so small. In contrast, the non-ejected drivers 
with passengers in “extreme” accidents exhibit a peculiar bimodal 
(U-shaped) distribution with occupants only slightly injured and 
occupants who are badly injured and, it so happens, no occupants in 
some of the intermediate categories. The variance therefore provides 
some information—at least a warning—about the frequency distribution 
within a class. 

From the above discussion it is apparent that for subjective scales 
the analytic tool should preferably be distribution-free (or non-para- 
metric) in so far as possible. Ridit analysis does have this desirable 
feature. Note that the probability statement associated with the 
average ridit is a distribution-free statement. 

Now we must consider the interpretation of a confidence interval 
about the mean. The confidence interval is readily interpreted as a 
confidence interval on the probability or odds. For example, non-ejected 
drivers with passengers in “moderately severe” accidents had, 
as we have seen, an average ridit of 0.27. The corresponding confidence 
interval is 0.24-0.30. Hence we could say, informally, that the chance 
that an individual in this class is better off than an individual in the 
reference class is greater than 2 to 1 and less than 4 to 1. 
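
The interval and its translation into odds can be reproduced from the tabulated figures. In the sketch below the multiplier of two standard errors is an assumption (it matches the rough 95% shortcut given earlier); the variance 0.0522 and the count 210 are the Table 3 entries for this class, and the function names are introduced here.

```python
import math


def confidence_interval(mean_ridit, variance, n, z=2.0):
    """Interval of z standard errors about an average ridit.

    z = 2.0 is assumed here as the rough 95% multiplier.
    """
    half_width = z * math.sqrt(variance / n)
    return mean_ridit - half_width, mean_ridit + half_width


def odds_better_off(p_worse):
    """Odds (to 1) of being better off than a reference individual."""
    return (1.0 - p_worse) / p_worse


low, high = confidence_interval(0.27, 0.0522, 210)
print(round(low, 2), round(high, 2))            # 0.24 0.3
# The upper ridit limit gives the smaller odds, the lower limit the larger.
print(round(odds_better_off(high), 1), round(odds_better_off(low), 1))
```

Both odds fall between 2 to 1 and 4 to 1, in line with the informal statement in the text.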

So far we have interpreted the average ridits in terms of the refer- 
ence class. Suppose that we have two classes (neither of which is a 
reference class) and we wish to compare the average ridits. For example, 
suppose that we wish to compare center and right front seat occupants 
in “moderately severe” accidents who are ejected with the corresponding 
individuals who are not ejected. The respective average ridits are 
0.54 and 0.38 and the difference is 0.16. What can we say about the 
situation in these two classes with respect to each other (i.e. without 
involving the reference class)? An estimate of the corresponding 
relative probabilities for the two classes can be obtained very simply 
by adding 0.50 to the numerical difference. Here if we add 0.50 to 0.16, 
we obtain 0.66. In terms of odds this would mean that the chances 
are about 2 to 1 that the ejected occupant will sustain a worse injury 
than the corresponding non-ejected occupant. The rule given here is 
an approximate one which eventually breaks down if the differences 
are close to 0.50 (or larger than 0.50). However, if in a particular 
instance a better estimate is required, this estimate can be obtained 
by setting up a new system of ridits using one of the classes to be compared 
(ordinarily the class with the larger number of individuals) as the new 
reference class. 
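
The approximate rule for two non-reference classes amounts to the following arithmetic. The function name is introduced here for illustration; the guard only rejects results that cannot be probabilities and does not capture the earlier loss of accuracy as the difference approaches 0.50.

```python
def relative_probability(mean_a, mean_b):
    """Approximate chance that a member of class A is worse off than a
    member of class B: 0.50 plus the difference of the average ridits."""
    p = 0.5 + (mean_a - mean_b)
    if not 0.0 < p < 1.0:
        raise ValueError("difference too large for the approximate rule")
    return p


# Ejected versus non-ejected center and right front seat occupants
# in "moderately severe" accidents (average ridits 0.54 and 0.38).
p = relative_probability(0.54, 0.38)
odds = p / (1.0 - p)
print(round(p, 2), round(odds, 1))  # 0.66 1.9, i.e. about 2 to 1
```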


5. Using Ridits to Examine Relationships 


In the preceding section we have considered the meaning of the 
average ridit in a given category and its confidence interval (i.e. one 
isolated category). This may have seemed somewhat academic to a 
working scientist. After all he is likely to be primarily interested in the 
underlying relationships between the factors in the study and the response 
variable. Unless confidence intervals or significance tests can lead to 
useful conclusions about the phenomena under study, the working 
scientist has no interest in them. Perhaps the best way to show that 
average ridits (and confidence intervals) can be of value to a working 
scientist is to exhibit them at work in an actual study. This requires 
something more than going through the mechanics—it is necessary to 
show that the procedure can lead to conclusions which are meaningful 
and important in the scientific field of study. 

Before proceeding with the ACIR example let us stop to consider 
what the analyst faces in this data: 


(1) A massive series of 2253 observations (from 1000 accidents 
collected during the period September 1953 to July 1955). 

(2) A large number of categories (45) with a variety of 
intercomparisons. 

(3) Unequal numbers of observations in the categories (one has 245, 
another is empty). 

(4) Three factors (ejection, seated position, accident severity) 
which happen to have a strong influence on injury and which 
have complex interactions. 

(5) Several artifacts in the data (see limitations of the data in [5]). 

(6) The task is not just to clarify relationships. The goal is to 
learn how to improve the engineering design of automobiles 
so as to cut down on the injuries in auto accidents. 


Although the above situation is a rather involved one, it will be 
attacked with a very simple statistical tool: a graph of average ridits 
and their confidence intervals. The following text (which is quoted 
almost verbatim from the original report [5]) will show how ridit 
analysis reduces the data to easily comprehensible form and suggests 
ways to deal with the serious public health problem of auto accidents. 

Accident Severity: Chart 1 (non-ejected occupants) indicates 
a clearcut relationship between accident severity and injury severity 
(as might be expected from common sense). For drivers alone, or with 
passengers, and for center and right front seat occupants, the average 
ridit increases rapidly with increasing accident severity and the 
separation of successive confidence intervals clearly shows that this is not a 
matter of sampling variation. For the rear seat occupants the average 
ridits in the “moderate” and “severe” categories have non-overlapping 
confidence intervals. However, the increase in the average ridit as 
accident severity increases is less marked for rear seat occupants. 


CHART 1 
CONFIDENCE INTERVALS ON THE AVERAGE RIDIT FOR NON-EJECTED 
OCCUPANTS BY ACCIDENT SEVERITY AND SEATED POSITION 
[Chart: vertical axis, degree of injury (ridit); horizontal axis, accident severity.] 

Explanation of Chart 1 
Legend: Drivers alone (diagonal lines), drivers with passengers (dotted lines), right front and center 
front seat occupants (criss-cross lines), rear seat occupants (horizontal lines). 

The heavy black line at 0.32 is the average ridit for all the non-ejected occupants. The dashed 
horizontal lines represent ridits of 0.25, 0.50, and 0.75. The black line in the center of each confidence 
interval is the average ridit for that group. 

Note: The average ridit and confidence interval for the rear seat occupants is missing in the 
extreme accident severity category as there are no cases in this group. 


CHART 2 
CONFIDENCE INTERVALS ON THE AVERAGE RIDIT FOR EJECTED 
OCCUPANTS BY ACCIDENT SEVERITY AND SEATED POSITION 

Explanation of Chart 2 
Legend: Drivers alone (diagonal lines), drivers with passengers (dotted lines), right front and center 
front seat occupants (criss-cross lines), rear seat occupants (horizontal lines). 

The heavy black line at 0.54 is the average ridit for all the ejected occupants. The dashed 
horizontal lines represent ridits of 0.25, 0.50, and 0.75. The black line in the center of each confidence 
interval is the average ridit for that group. 


CHART 3 
CONFIDENCE INTERVALS ON THE AVERAGE RIDITS FOR DESIGNATED 
OCCUPANTS (DRIVERS WITH PASSENGERS, CENTER FRONT AND RIGHT 
FRONT SEAT OCCUPANTS) BY EJECTION STATUS AND ACCIDENT SEVERITY 

Explanation of Chart 3 
Legend: Ejected occupants (dotted lines), non-ejected occupants (criss-cross lines). 

The heavy black line at 0.32 is the average ridit for all the non-ejected occupants considered 
here, and the light black line at 0.55 is the average ridit for all the ejected occupants considered here. 


30 BIOMETRICS, MARCH 1958 

For the ejected individuals, the relationship between accident 
severity and injury is less apparent (Chart 2). For the drivers with 
passengers and the front seat occupants, the average ridit in the 
“minor” accident severity category is significantly lower than the 
corresponding average ridits in the “moderate” or worse accident 
severities. However, apart from the “minor” accident severity, there 
is no clearly discernible pattern (it must be remembered that the 
number of cases is relatively small). 

Ejection: Chart 3 compares ejected and non-ejected drivers 
with passengers and front seat occupants (combined) for various 
accident severities. It will be observed that in the “moderate” and 
“moderately severe” accidents, the ejected individuals are much worse 
off than the corresponding non-ejected occupants. As the accident 
severity increases, the differential becomes smaller and in the “extreme” 
accidents, the individual tends to be badly injured whether or not he 
stays in the car (the average ridits are approximately equal). In the 
“minor” accidents, ejected and non-ejected occupants also have similar 
ridits. 

The “moderately severe” and “severe” accident categories are 
especially important in this study because it is in these categories 
that automotive redesign could presumably do the most good. In 
the “minor” accident severity, the injuries may be painful but are 
rarely dangerous to life. The “extreme” accidents are relatively rare 
(less than 3 per cent of the occupants are in the “extreme” accidents), 
and the force conditions are such that safety features may have limited 
value. 

In the case of ejection, there is a specific and obvious component 
of the car implicated, namely, the door lock. It should be remembered 
that this data deals with pre-1956 automobiles. (Note: Redesigned 
door locks have been used in post-1956 automobiles.) 

Seated Position: Because of the marked influence of ejection, we 
shall confine our remarks about seated position to the non-ejected 
individuals. From Chart 1 we can see that the “drivers alone” are 
worse off than the occupants in other seated position categories. However, 
because of an artifact, we will pass over the “driver alone” category. 

The right front and center front seat occupants are significantly 
worse off than the drivers with passengers as long as the accident 
severity is “moderately severe” or less. However, this differential 
tends to disappear in the more severe accidents. The curve for the 
rear seat occupants crosses the curve for drivers with passengers, being 
higher (though not significantly so) in “minor” accidents and lower 
(though not significantly so) in “severe” accidents. In the “moderately 
severe” and “severe” accidents (which represent the crux of the injury 
problem), the rear seat occupants are significantly better off than the 
center and right front seat occupants. 
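
The graphical criterion used in these comparisons (two classes differ significantly when their confidence intervals do not overlap) can be sketched as follows. The means, variances, and counts are the tabulated entries for non-ejected right front and center front occupants and for non-ejected drivers with passengers; the multiplier of two standard errors and the function names are assumptions made for the illustration.

```python
import math


def interval(mean, variance, n, z=2.0):
    """Interval of z standard errors about an average ridit (z assumed)."""
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# "Moderately severe" accidents.
rf_cf_mod_sev = interval(0.38, 0.0568, 241)
driver_mod_sev = interval(0.27, 0.0522, 210)

# "Severe" accidents.
rf_cf_severe = interval(0.52, 0.0801, 103)
driver_severe = interval(0.47, 0.0773, 76)

print(intervals_overlap(rf_cf_mod_sev, driver_mod_sev))  # False
print(intervals_overlap(rf_cf_severe, driver_severe))    # True
```

The intervals separate in the moderately severe accidents and overlap in the severe ones, which is the pattern described in the text.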

The implications to be drawn concerning automotive structure 
are necessarily indirect. However, we can at least make some educated 
guesses from these findings. For example, the very favored position of 
the driver in minor accidents is probably associated with the steering 
wheel. With the aid of the steering wheel, the driver is able to maintain 
his position in his seat whereas the other occupants may be displaced 
from their seats even in these mild force conditions. 

In more drastic force conditions, the driver loses his advantage 
over the other front seat occupants. There are two plausible expla- 
nations and both may hold. First of all, the steering wheel may no 
longer serve to retain the driver in his seat when the force conditions 
are more severe. Second, the steering wheel may itself produce injuries. 
These explanations may become more clearly distinguished when data is 
accumulated on the new model cars with redesigned steering wheels. 

The experience of the rear seat occupants is very interesting and 
it is unfortunate (for this study, if not for the occupants) that there is 
no information about non-ejected rear seat occupants in “extreme” 
accidents. The favorable status of rear seat occupants in “severe” 
accidents has a hopeful implication. A main distinction between the 
environment of a right front seat occupant and a rear seat occupant 
is that the former faces the instrument panel while the latter faces a 
seat back. In any event, the fact that the rear seat occupant is so 
markedly better off than the right front seat occupant in “severe” 
accidents strongly suggests that the redesign of the automobile interior 
could appreciably reduce the present injury and death toll. 

Interactions: An “interaction” between variables means that the 
effects of the variables do not “add up” in a simple fashion. These 
particular data have numerous such interactions, as has been pointed 
out above. The points we wish to make are (1) that the interactions 
may be of very great interest and (2) that even the simple ridit analysis 
used here can provide a fairly clear picture of these interactions. 

If we look at variables one at a time we are likely to miss these 
interactions altogether. For example, suppose that we were to study 
the relationship between accident severity and injury without distinguishing 
seated position or ejection status. The effect would be to 
miss the point that the relationship between accident severity and 
injury is quite different for the ejected as opposed to the non-ejected 
occupants. 

Similarly, if we were to study seated position without regard to 
accident severity, we would miss the important point that the differential 
between drivers with passengers and center and right front seat 
occupants is marked in “minor” or “moderate” accidents and tends 
to disappear as the accident becomes more severe. 

The routine application of a statistical method such as the chi- 
square test may obscure the interactions even when the variables are 
studied simultaneously. 
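
One way to see an interaction in the average ridits is to tabulate the effect of one factor at each level of another: if the effects "added up", the differences would be roughly constant. The sketch below does this for ejection, using the Table 2 entries for right front and center front seat occupants only (the values .54 and .61 are read from the table with the decimal point restored). It is an illustration, not the analysis of the original report.

```python
severities = ["minor", "moderate", "moderately severe", "severe", "extreme"]

# Average ridits for right front and center front seat occupants (Table 2).
non_ejected = [0.22, 0.29, 0.38, 0.52, 0.64]
ejected = [0.27, 0.47, 0.54, 0.61, 0.50]

# Effect of ejection at each accident severity.
ejection_effect = [round(e - n, 2) for e, n in zip(ejected, non_ejected)]

for name, diff in zip(severities, ejection_effect):
    print(f"{name:18s} {diff:+.2f}")

# A constant effect would give a spread of zero across severities.
spread = max(ejection_effect) - min(ejection_effect)
print(round(spread, 2))
```

The differential is largest in the moderate and moderately severe accidents and shrinks or reverses at the extremes, so the effect of ejection cannot be described by a single number.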


6. Special Problems with Borderland Variables 


All scientists who work with “borderland” variables have a heavy 
cross to bear. Sooner or later someone (perhaps a physical scientist) 
dismisses the data, no matter how carefully it was collected or how 
extensive it may be, with a shrug and an offhand remark to the effect 
that after all it is “subjective.” 

There is an important point which is not always realized by those 
critics who see the word “subjective” as a severe condemnation of any 
body of data. All observations—including those in the physical sciences— 
necessarily involve an observer and are, strictly speaking, subjective. 
So “objectivity” and “subjectivity” are a matter of degree rather than 
kind. There is a fairly close correspondence between the complexity 
of the task performed by the observer and the degree of subjectivity. 
If a physical scientist is working at the utmost limits of his measuring 
instruments so as to detect very small effects his results will depend 
largely on his technical skill. Consequently different scientists will 
report different results (i.e. the measurement system will show the same 
sort of space-time instability that is encountered with subjective scales). 

There are various straightforward ways of improving the space-
time stability of a subjective scale (essentially by simplifying operations
so as to reduce the element of personal judgment to a minimum) but
in practice we can go only so far along these lines. We will then face 
the choice of using a scale which is not as stable as we might like or 
of not studying the phenomena at all. However, with reasonable care 
and a judicious choice of statistical methods we can minimize the risks 
inherent in working with “borderland” variables. 

To see more clearly how ridits can be used with subjective scales let 
us consider very briefly some peculiarities of these scales. The units 
(classifications) of a subjective scale are not likely to be “equal in
length” like the units of a pocket ruler. Instead the subdivisions of a
subjective scale tend to be determined by the ability (or supposed ability)
of the individual making the rating to make distinctions between the
subjects to be rated. Consequently investigators at two different 
institutions, approaching the same phenomena, such as injury, might 
use very different scales. One person might use a 5-point scale, another 
a 15-point scale. The same investigator may, from time to time, wish 
to consolidate or elaborate the subdivisions of his subjective scale. 
Now let us suppose that two investigators (using scales with different 
numbers of subdivisions but otherwise consistent with each other) rate 
the same two sets of subjects (i.e. a “reference” set and an “other” set).
Both apply ridits to their data and then the two investigators compare
results. They will find very nearly the same average ridit in the “other”
set!

Or consider another peculiarity of subjective scales, one which can
occur even when two investigators use the same number of subdivisions
and the same name for the subdivisions. It is characteristic of
subjective scales that a kind of “slippage” will take place: the borderline
between two categories will not coincide for the two investigators. A
class of cases will exist which one scientist would rate as “minor” but 
which the other rates as “moderate.” Yet if the two investigators rate 
the same two series of subjects and then use ridits their results will be 
in good numerical agreement! 

These and similar properties of ridits are not very surprising when 
the meaning of the average ridit is recalled. Both investigators are 
estimating the same thing: the chance that an individual in the “other” 
series is worse off than an individual in the reference set. The effects of 
consolidating or elaborating categories or slippage are similar to those in 
the estimation of the mean of a distribution from grouped data. 

For numerical “borderland” variables the ridit transformation also
has interesting properties. For example, suppose that one investigator
likes to work with tumor volume while another prefers a “linear”
measure (i.e. the cube root of the volume). The ridit analysis of a body
of data will be the same whether it is reported in volume or “linear”
measurements (ridits are invariant under a monotone transformation
of the original variable).
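
The invariance can be checked numerically. The sketch below is only an illustration: it assumes the usual definition of a ridit (the proportion of the reference set below a value plus half the proportion tied with it), and the tumor volumes are invented for the purpose.

```python
from bisect import bisect_left, bisect_right

def ridit(value, reference):
    """Proportion of the reference set below `value` plus half the proportion tied with it."""
    ref = sorted(reference)
    below = bisect_left(ref, value)
    tied = bisect_right(ref, value) - below
    return (below + 0.5 * tied) / len(ref)

def mean_ridit(other, reference):
    """Average ridit of the 'other' set relative to the reference set."""
    return sum(ridit(x, reference) for x in other) / len(other)

def cube_root(volume):
    """The 'linear' measure: a monotone transformation of volume."""
    return volume ** (1.0 / 3.0)

# Hypothetical tumor volumes (arbitrary units), invented for illustration only.
reference_volumes = [1.0, 8.0, 27.0, 64.0, 125.0]
other_volumes = [8.0, 64.0, 216.0]

by_volume = mean_ridit(other_volumes, reference_volumes)
by_linear = mean_ridit([cube_root(v) for v in other_volumes],
                       [cube_root(v) for v in reference_volumes])
print(by_volume, by_linear)
```

Both calls give the same average ridit because only the ordering of the observations enters the calculation.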

I should not like to give the impression that ridits are a panacea 
for “borderland” variables—no analytic trick can solve some of the 
deep-rooted problems in this area. But ridits do have certain properties
that may sometimes improve the space-time stability of “borderland”
variables.


7. Graphical Presentation 


Ridit analysis lends itself nicely to graphical presentation. Since
ridit averages (and confidence intervals) must lie between zero and one
the vertical axis of the graphs will be simple and standard (this rules 
out misleading graphs where the vertical scale is stretched or con- 
tracted to “prove” some point). A surprising amount of information 
can be put on a single graph (as in Chart 1). 

There is one simple trick which can add even more information to 
the graph. The original scale can be indicated on the right hand vertical 
boundary of the graph (for example the 0.25 line here would be labeled 
“minor”). This allows an interpretation of the results in terms of the 
original scale which is often useful (but some caution is needed here). 

In the ACIR study the ridit graphs were used to study three factors 
simultaneously with respect to a single response variable. A similar 
graphical approach can be effective in many other situations. In an 
epidemiological study of mental health there were hundreds of factors 
and the first task was to screen out the important ones. A series of 
graphs of the factors (individually) vs. the response variable (a ridited 
mental health rating) indicated which were “live” factors and also gave 
an indication of the relative importance of the factors. The shortcut 
version of ridit analysis is very helpful in applications of this kind. 

In retrospective studies the roles of stimulus and response variables
are just reversed (i.e. the factors are treated as if they were response
variables). Here fixed groups (“study” and “control” series) are
compared with respect to a large number of different “response”
variables. Because the “response” variables are measured on completely
different scales, it is rather hard to judge their relative importance when 
the original scale is used (some scales are graded categories; the numerical 
ones have a variety of units). By putting all of the “response” variables 
on zero-to-one ridit scales, a more direct comparison is possible [6]. 

Another interesting application of ridits occurs when there are 
several different response variables, which are all indices of the same 
thing (say the potency of a drug). For example, in a laxative trial two 
dissimilar indices (number of movements, consistency of movements) 
were employed. Putting both indices on ridit scales made it possible 
to compare directly the results for the two indices (see [7]). 

The graphical approach using average ridits and their confidence 
intervals is free from nearly all of the usual objections to graphical 
methods. The investigator will not be fooled by sampling variation 
and the special properties of the ridit scale are an extra safeguard. 


8. The Reason Why 


Whether or not ridit analysis will work (in the sense of producing 
useful results) is something that can only be determined by actual
trial. The question of whether ridit analysis will work in a technical
sense is one which requires some mathematics to answer [1]. However 
for the benefit of technically-minded readers (and referees) a few com- 
ments are in order. 

Since the general properties of the t-test family of techniques are 
well known, the simplest approach is to regard ridits as an arbitrary 
scale and to ask how the properties of this scale differ from those of 
familiar measurement scales. For the identified distribution the 
mechanics of ridit analysis imposes a strong restriction. No matter 
what the nature of the original observations may be the distribution of 
the ridits is going to be closely approximated by a theoretical curve 
(the “rectangular distribution”). The sole exception occurs when
nearly all of the observations fall into one or two categories (in which
case a correction is needed to reduce the variance and the
approximation is poorer).

First let us consider the situation where, apart from sampling 
variation, there is nothing at all going on in the data. To make matters 
even simpler let us suppose that the identified distribution is so large 
that it accurately represents the population from which it (and the other 
categories) was drawn. Then the distributions in the other categories 
can also be well approximated by the rectangular distribution. The 
average ridits are simply means of samples drawn from the rectangular 
distribution. The expected value of each mean is 1/2, the variance is
1/(12N), and the means are nearly normally distributed even in small
samples. The shortcut confidence intervals (average ridit $\pm\ 1/\sqrt{3N}$)
apply and provide 5% level significance tests (i.e. inclusion of the
value 1/2).
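
The shortcut interval is two standard errors on either side of the average ridit, since $2\sqrt{1/(12N)} = 1/\sqrt{3N}$. A minimal sketch of the arithmetic follows; the function names and the sample figures are illustrative assumptions, not taken from the ACIR data.

```python
import math

def shortcut_interval(average_ridit, n):
    """Shortcut confidence interval: average ridit +/- 1/sqrt(3N)."""
    half_width = 1.0 / math.sqrt(3.0 * n)   # equals 2 * sqrt(1 / (12 * n))
    return average_ridit - half_width, average_ridit + half_width

def differs_from_reference(average_ridit, n):
    """True when the interval excludes 1/2, i.e. significant at about the 5% level."""
    low, high = shortcut_interval(average_ridit, n)
    return not (low <= 0.5 <= high)

# Hypothetical group of 75 subjects: the half-width is 1/sqrt(225) = 1/15.
print(shortcut_interval(0.60, 75))
print(differs_from_reference(0.60, 75), differs_from_reference(0.55, 75))
```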

Now change the situation so that the identified distribution departs 
somewhat from the true distribution. In this case, in terms of the 
observed ridits, the true distribution will be a slightly distorted version 
of the rectangular distribution (provided the reference set is moderately 
large—as it would be in practice). The average ridit for the true 
distribution will not necessarily be 1/2 but it has a 95% chance of 
falling in the shortcut confidence interval for the identified distribution. 
The other categories represent samples from this slightly distorted 
rectangular distribution whose variance would be close to 1/12 (ordi- 
narily a little smaller) and whose other properties would be only slightly 
changed (for example, the standardized fourth moment, $\beta_2$, would be 
less than 3, etc.). Confidence intervals and tests based on the shortcut 
would be good enough approximations for practical purposes. Note 
that so far no assumption has been made about the original observations 
(except that if they are like dichotomous classifications a correction 
to the shortcut intervals is needed).

Let us turn to the situation, of much greater scientific interest,
where something is going on in the data. The identified distribution
is still rectangular but the distributions in the other categories may be 
quite different. However, our arbitrary ridit scale—unlike familiar 
measurement systems—imposes certain theoretical restrictions. The 
average ridit must fall in a closed interval rather than in an infinite 
range. The variance cannot be greater than 3/12 and must be still 
smaller if the expected value departs from 1/2. In actual studies 
variances greater than 1/8 are very rare. So the crude assumption that 
the distributions in the other categories will have properties similar to 
the rectangular distribution (and hence the use of the shortcut) works 
out fairly well in practice. However, to avoid this assumption I would 
suggest the use of direct estimates of variances in the calculation of 
confidence intervals or significance tests (i.e. standard t-test family 
procedures). 

The usual two-tailed statistical procedures at the 5% level (in- 
cluding the confidence intervals in the ACIR example) are rather 
insensitive to the distribution of the observations. Apart from
independence of the observations the only distributional property likely
to cause serious trouble is the occurrence of erratic or outlying
observations (which, technically, would be reflected in $\beta_2$). Now
“borderland” variables do have a tendency to be erratic but, as can be seen
at once from the mechanics of ridits, the unruly observations can have 
little influence on the ridit scale. It may also be noted that, because of 
the restrictions on ridit variances, the direct estimates of variance are 
more precise for ridits than they are for the usual unrestricted variables. 
So while the application of standard methods to ridits provides valid 
confidence intervals (in the sense that the probability levels are less 
than or equal to the nominal 5% level) the procedure may be
over-conservative.


THE k-VISIT METHOD OF CONSUMER TESTING


George E. Ferris*
Biometrics Unit, Cornell University 
Ithaca, New York, U.S.A. 


I. Introduction 


When a pair of products, A and B, is being consumer-tested for 
preference, it is often found that the results of a single interview are 
insufficient to determine the attitude of the consumer population with 
respect to the products. At the interview it is usual to ask the consumer 
to indicate preference for one of the two products, or to declare no 
preference. The proportion of no preference votes is particularly 
dependent on the phrasing of the questionnaire or emphasis of the 
question. In addition, the information contained in the no preference 
(or tied) votes has often been wasted, misused, or misinterpreted, in 
distributing these votes in order to reduce data to two figures: the 
preferences for A and B. 

The problem is formulated below as a problem in estimation. A 
model for the behavior of the consumer population is proposed, and the 
parameters of the proposed model are estimated. The variance- 
covariance matrix of the estimators is obtained. A check on the ap- 
propriateness of the model is indicated, and some common departures 
from the model are mentioned. Finally, the method is illustrated with 
some numerical examples. 


II. Formulation of the Problem 


The manufacturer or research worker wishes to compare two 
products, A and B, as regards consumer preference. Both products 
may be experimental, or one may be currently produced, or they may 
be the products of rival firms. 

Let $\pi_i$ be the proportion of the population with a real preference
for the product $i$, where $i = a, b$; and $\pi_0$ will be used to denote the
proportion of those in the population who either cannot discriminate,
or have no preference. It follows that $0 < \pi_i < 1$, $i = a, b, 0$, and
$\pi_a + \pi_b + \pi_0 = 1$.


*Now at General Foods Corporation, Tarrytown, New York.

If we knew the $\pi_i$, our course of action would be as follows:

1. When $\pi_0 > \pi_0(0)$, where $\pi_0(0)$ is fixed by economic considerations,
we would market the less expensive of A and B only (or the one that is
already in production, or the one more convenient to produce). This
situation is represented by Region I in Diagram 1 below, and occurs
when there is a high proportion of non-discriminators, or of consumers
with no preference, in the population [$0 < \pi_0(0) < 1$].

2. When $\pi_0 < \pi_0(0)$, and $\pi_a/\pi_b > c_1$, where $c_1$ is some positive
constant determined by the economist, product A only would be
marketed. Region II in Diagram 1 represents this situation; viz.
when enough people have a preference, and of them a sufficient majority
prefers product A.

3. Similarly, when $\pi_0 < \pi_0(0)$, and $\pi_a/\pi_b < c_2$, where $c_2$ is some
positive constant ($< c_1$) prescribed by the economist, product B only
would be marketed. Region III in Diagram 1 below represents this
situation; viz. when enough people have a preference, and of them a
sufficient majority prefers product B.

4. Finally, when $\pi_0 < \pi_0(0)$ and $c_2 < \pi_a/\pi_b < c_1$, where $c_1$ and $c_2$
are as in 2 and 3 above, both products A and B would be marketed,
because we are then in Region IV of Diagram 1 below, and this means
that enough people have a real preference, and they are not far from
equally divided in preferring products A and B.


DIAGRAM 1
REGIONS FOR PROCEDURES

The problem is to estimate the point in Diagram 1 corresponding
to ($\pi_a$, $\pi_b$, $\pi_0$) together with its confidence region, to enable one of the
procedures 1 to 4 to be chosen.
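
The four procedures amount to a simple decision rule once the proportions and the economic constants are given. The sketch below is illustrative only: the threshold values are hypothetical, the names `pi_0_limit`, `c_1` and `c_2` are labels chosen here for the constants fixed by the economist, and boundary cases (exact equality), which the text leaves open, are assigned arbitrarily.

```python
import math

def choose_procedure(pi_a, pi_b, pi_0, pi_0_limit, c_1, c_2):
    """Return the number (1 to 4) of the marketing procedure for the given proportions.

    pi_0_limit, c_1 and c_2 (with c_2 < c_1) are constants fixed by economic
    considerations; the values used below are hypothetical.
    """
    if pi_0 > pi_0_limit:
        return 1                       # Region I: too few consumers with a real preference
    ratio = pi_a / pi_b if pi_b > 0 else math.inf
    if ratio > c_1:
        return 2                       # Region II: market A only
    if ratio < c_2:
        return 3                       # Region III: market B only
    return 4                           # Region IV: market both A and B

# Hypothetical constants, applied to the estimates reported in the numerical examples.
limits = dict(pi_0_limit=0.5, c_1=1.5, c_2=2.0 / 3.0)
print(choose_procedure(0.497, 0.370, 0.133, **limits))   # Example 1 estimates
print(choose_procedure(0.118, 0.014, 0.868, **limits))   # Example 2 estimates
print(choose_procedure(0.691, 0.115, 0.194, **limits))   # Example 3 estimates
```

In practice the estimated point carries a confidence region, so a rule of this kind would be applied with that uncertainty in mind rather than to the point estimate alone.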


III. The Model 


Let the consumers who really prefer A, really prefer B, and those 
who cannot discriminate or have no preference, be referred to as con- 
sumer types a, b, and 0, respectively. 

Let $P_i(j)$ denote the probability of a consumer type $i$ ($i = a, b, 0$)
voting for product $j$ on any particular occasion when the pair of products
A and B is presented; $j = A, B,$ or $0$, and the 0 denotes a no preference
vote.

It is assumed:

$P_a(A) = P_b(B) = 1$

$P_a(B) = P_b(A) = P_a(0) = P_b(0) = 0$

$P_0(0) = 1 - 2p$

$P_0(A) = P_0(B) = p$, where $0 < p < 1/2$,


i.e. it is assumed that those who really prefer A (or B) will consistently 
so vote each time they are confronted with the choice, and that those 
with no consistent preference (who include non-discriminators, those 
who are indifferent in their preference, and others who think they have 
a preference at any given time) will vote sometimes for A (with prob- 
ability p), sometimes for B (with probability p), and will admit to 
having no preference the rest of the time (probability 1 — 2p). These 
assumptions possibly give a very idealized picture of the behavior of 
the consumer population, because it is quite conceivable that in those 
cases when A and B differ only slightly each of our respondents should 
be characterized by a parameter representing the long-term proportion 
of times he will vote for A, say. This is the parameter held fixed at 1 
or 0 above. The model discussed here is considered realistic as an 
extreme case of the more general model, one which would be much 
more difficult to handle because it needs to assume further some kind 
of a distribution (as regards the long-term parameter) of the respondents. 
One outcome of the assumption is that the category described as type 
0 of consumers is very inclusive, and hence the corresponding estimate 
must be interpreted with care. 


IV. Data Required 


A representative sample of size N of the consumers is selected. 
Each member of this panel of N is then either visited twice or asked to
judge twice the same pair of products A and B. Naturally, they are
not told that they are judging the same pair on both occasions. They
are simply asked to indicate a preference for one of the two products,
or to vote “no preference” on each occasion. The usual precautions are
taken to control code bias and order of presentation effects by balancing, 
and the pair presented the first time bears different coding from that 
presented the second time. 

The N judges can then be classified into nine categories as shown in 
Table 1 below. Table 1 also shows the number of judges expected to 
fall into each category under the model assumed. 


TABLE 1

Category | Description of Judges in This Category                         | Observed No. | Expected No.
1        | Those voting for A both visits                                 | $N_{aa}$     | $N(\pi_a + p^2\pi_0)$
2        | Those voting for B both visits                                 | $N_{bb}$     | $N(\pi_b + p^2\pi_0)$
3        | Those voting A first visit, B second visit                     | $N_{ab}$     | $Np^2\pi_0$
4        | Those voting B first visit, A second visit                     | $N_{ba}$     | $Np^2\pi_0$
5        | Those voting A first visit, no preference second visit         | $N_{a0}$     | $Np(1-2p)\pi_0$
6        | Those voting no preference first visit, A second visit         | $N_{0a}$     | $Np(1-2p)\pi_0$
7        | Those voting B first visit, no preference second visit         | $N_{b0}$     | $Np(1-2p)\pi_0$
8        | Those voting no preference first visit, B second visit         | $N_{0b}$     | $Np(1-2p)\pi_0$
9        | Those voting no preference both visits                         | $N_{00}$     | $N(1-2p)^2\pi_0$
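
The expected numbers of Table 1 can be tabulated for any trial values of the parameters. The following is a small sketch under the model of Section III; the dictionary keys are shorthand introduced here for the nine categories, and the parameter values are the rounded estimates later reported for Example No. 1.

```python
def expected_counts(n, pi_a, pi_b, pi_0, p):
    """Expected number of judges in each of the nine 2-visit categories."""
    q = 1.0 - 2.0 * p                  # chance that a type-0 consumer votes "no preference"
    both = n * p * p * pi_0            # type-0 consumers voting for a product on both visits
    once = n * p * q * pi_0            # type-0 consumers voting for a product on one visit only
    return {
        "aa": n * pi_a + both, "bb": n * pi_b + both,
        "ab": both, "ba": both,
        "a0": once, "0a": once, "b0": once, "0b": once,
        "00": n * q * q * pi_0,
    }

counts = expected_counts(900, 0.497, 0.370, 0.133, 0.2870)
for category, value in counts.items():
    print(category, round(value, 1))
```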


The procedure described so far applies to the 2-visit method. The 
method extends readily to the case k > 2 (see IX below). In the case 
of home use tests there would be no difficulties encountered. The 
interviewer would leave one pair of samples A and B at the first visit 
with the N members of the panel, N/2 members being asked to judge 
in the order AB, the other N/2 in the order BA; also, N/2 of them 
would receive A with the lower code number, N/2 with B as the lower 
code number. At a later visit the interviewer would leave another 
pair of samples A and B, coded differently, with the same N judges, 
both the requested order of judging and the coding being reversed with 
respect to A and B this time. There is no reason why the panel should 
not judge other products between the two visits, provided the interval 
between the two visits with A and B is not so great as to permit members
to change preferences owing to some factor such as overfamiliarity
with the product. In store testing the judges cooperating are asked to 
evaluate one pair of A and B at one end of a long counter, the second 
pair (coded differently) at the other end. The usual balancing of code 
numbers and order of presentation is still possible. 


V. Estimation of the Parameters 


Let the maximum likelihood (M.L.) estimates of the $\pi_i$ ($i = a, b, 0$)
and $p$ be denoted $\hat{\pi}_i$ ($i = a, b, 0$) and $\hat{p}$, respectively.
The likelihood function is written:

$$L = (\pi_a + p^2\pi_0)^{N_{aa}} (\pi_b + p^2\pi_0)^{N_{bb}} (p^2\pi_0)^{N_{ab}+N_{ba}} [p(1-2p)\pi_0]^{N_{a0}+N_{0a}+N_{b0}+N_{0b}} [(1-2p)^2\pi_0]^{N_{00}}$$

Let

$M = N - N_{aa} - N_{bb}$

$N_x = N_{ab} + N_{ba}$

$N_y = N_{a0} + N_{0a} + N_{b0} + N_{0b}$.

Then, after taking logarithms of both sides, and partially
differentiating with respect to $p$, $\pi_a$, and $\pi_b$, in turn, and equating to zero,
three equations in three unknowns: $\hat{p}$, $\hat{\pi}_a$, and $\hat{\pi}_b$, result. On solving
these, one obtains:

$$\hat{p} = \frac{M \pm \sqrt{M^2 - \tfrac{1}{2}(2N_{00} + N_y)(2N_x + N_y)}}{2N_{00} + N_y}$$

$$\hat{\pi}_a = \frac{N_{aa}(1 - \hat{p}^2) - (N - N_{bb})\hat{p}^2}{N(1 - 2\hat{p}^2)}$$

$$\hat{\pi}_b = \frac{N_{bb}(1 - \hat{p}^2) - (N - N_{aa})\hat{p}^2}{N(1 - 2\hat{p}^2)}$$

$$\hat{\pi}_0 = \frac{M}{N(1 - 2\hat{p}^2)}$$

All three solutions for $\hat{p}$ naturally satisfy the equation from which
they were derived. But for the overall least squares problem the
solution with the minus sign is the one usually wanted. The solution
with the plus sign usually corresponds to a maximum, or to a value of
$p$ outside the range $0 < p < 1/2$. The solution $\hat{p} = 1/2$ is usually a
point of inflexion, but becomes a minimum in certain extreme cases,
and in these cases the other solution is indeterminate in form. In this
last-named case the model is usually a bad fit, but putting $\hat{p} = 1/2$
gives the best possible fit of a poor lot.
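
As a check on the arithmetic, the estimates can be computed directly from the nine observed counts. The sketch below is an illustration rather than the author's own computing scheme: it assumes the minus-sign root $\hat{p} = [M - \sqrt{M^2 - \tfrac{1}{2}(2N_{00}+N_y)(2N_x+N_y)}]/(2N_{00}+N_y)$ with $M = N - N_{aa} - N_{bb}$, $N_x = N_{ab}+N_{ba}$, $N_y = N_{a0}+N_{0a}+N_{b0}+N_{0b}$, together with $\hat{\pi}_0 = M/[N(1-2\hat{p}^2)]$ and $\hat{\pi}_a = N_{aa}/N - \hat{p}^2\hat{\pi}_0$ (similarly for $\hat{\pi}_b$). The function and variable names are introduced here for illustration; the counts are those tabulated for Example No. 1.

```python
import math

def estimate_two_visit(n_aa, n_bb, n_ab, n_ba, n_a0, n_0a, n_b0, n_0b, n_00):
    """Return (p_hat, pi_a_hat, pi_b_hat, pi_0_hat) for the 2-visit model."""
    n = n_aa + n_bb + n_ab + n_ba + n_a0 + n_0a + n_b0 + n_0b + n_00
    m = n - n_aa - n_bb                      # judges outside categories 1 and 2
    n_x = n_ab + n_ba                        # reversals of preference
    n_y = n_a0 + n_0a + n_b0 + n_0b          # exactly one "no preference" vote
    denom = 2 * n_00 + n_y
    if denom == 0:
        raise ValueError("no 'no preference' votes: the minus-sign root is indeterminate")
    root = math.sqrt(m * m - 0.5 * denom * (2 * n_x + n_y))
    p_hat = (m - root) / denom               # solution with the minus sign
    pi_0_hat = m / (n * (1.0 - 2.0 * p_hat ** 2))
    pi_a_hat = n_aa / n - p_hat ** 2 * pi_0_hat
    pi_b_hat = n_bb / n - p_hat ** 2 * pi_0_hat
    return p_hat, pi_a_hat, pi_b_hat, pi_0_hat

# Observed counts for Example No. 1, in the category order of Table 1.
p_hat, pi_a_hat, pi_b_hat, pi_0_hat = estimate_two_visit(457, 343, 8, 14, 14, 12, 17, 11, 24)
print(round(p_hat, 4), round(pi_a_hat, 3), round(pi_b_hat, 3), round(pi_0_hat, 3))
```

The three proportions returned always sum to one, since categories 1 and 2 are fitted exactly and the remainder is assigned to $\hat{\pi}_0(1 - 2\hat{p}^2)$.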


VI. Errors in Estimation 


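There is no direct interest in the estimation of $p$. By substituting
the expressions [see Table 1] for the expected values of $N_{aa}$, $N_{bb}$, in the
formula for $\hat{\pi}_a$, it is noted that the expectation of $\hat{\pi}_a$ is obtained as $\pi_a$,
irrespective of the value of $\hat{p}$ because all terms containing $\hat{p}$ cancel.
Similarly, $\pi_b$ is the expected value of $\hat{\pi}_b$, $\pi_0$ of $\hat{\pi}_0$.

Whereas the variance-covariance matrix of the $\hat{\pi}_i$ is hard to write
down directly, the variance-covariance matrix of the $q_m$ ($m = 1, 2, 3, 4, 5$)
below, can be written directly, where:

$q_1 = \hat{\pi}_a + \hat{p}^2\hat{\pi}_0$

$q_2 = \hat{\pi}_b + \hat{p}^2\hat{\pi}_0$

$q_3 = 2\hat{p}^2\hat{\pi}_0$

$q_4 = 4\hat{p}(1 - 2\hat{p})\hat{\pi}_0$

$q_5 = (1 - 2\hat{p})^2\hat{\pi}_0$

and these can be regarded as multinomial probabilities. Once one has
obtained these, however, one can use the relations:

$\hat{\pi}_a = q_1 - \tfrac{1}{2}q_3$

$\hat{\pi}_b = q_2 - \tfrac{1}{2}q_3$

$\hat{\pi}_0 = 2q_3 + q_4 + q_5$

to obtain:

$$\mathrm{Var}(\hat{\pi}_a) = \frac{\pi_a(1 - \pi_a) + \tfrac{3}{2}\pi_0 p^2}{N}, \qquad \mathrm{Var}(\hat{\pi}_b) = \frac{\pi_b(1 - \pi_b) + \tfrac{3}{2}\pi_0 p^2}{N}$$

$$\mathrm{Var}(\hat{\pi}_0) = \frac{\pi_0(1 - \pi_0) + 4\pi_0 p^2}{N}, \qquad \mathrm{Cov}(\hat{\pi}_a, \hat{\pi}_b) = \frac{-\pi_a\pi_b + \tfrac{1}{2}\pi_0 p^2}{N}$$

$$\mathrm{Cov}(\hat{\pi}_a, \hat{\pi}_0) = -\frac{2\pi_0 p^2 + \pi_a\pi_0}{N}, \qquad \mathrm{Cov}(\hat{\pi}_b, \hat{\pi}_0) = -\frac{2\pi_0 p^2 + \pi_b\pi_0}{N}$$
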

In practice the parameters on the right sides of these expressions 
would be replaced by their estimates. 
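
A sketch of the standard-error calculation follows. It assumes the variance expressions obtained by treating the $q_m$ as multinomial proportions, namely $\mathrm{Var}(\hat{\pi}_a) = [\pi_a(1-\pi_a) + \tfrac{3}{2}\pi_0 p^2]/N$ (likewise for $\hat{\pi}_b$) and $\mathrm{Var}(\hat{\pi}_0) = [\pi_0(1-\pi_0) + 4\pi_0 p^2]/N$, with the parameters replaced by their estimates. The function name is introduced here; the inputs are the rounded estimates reported for Example No. 1.

```python
import math

def standard_errors(n, p, pi_a, pi_b, pi_0):
    """Approximate standard errors of the estimates of pi_a, pi_b and pi_0."""
    shared = pi_0 * p * p              # the term p^2 * pi_0 common to all three variances
    var_a = (pi_a * (1.0 - pi_a) + 1.5 * shared) / n
    var_b = (pi_b * (1.0 - pi_b) + 1.5 * shared) / n
    var_0 = (pi_0 * (1.0 - pi_0) + 4.0 * shared) / n
    return math.sqrt(var_a), math.sqrt(var_b), math.sqrt(var_0)

se_a, se_b, se_0 = standard_errors(900, 0.2870, 0.497, 0.370, 0.133)
print(round(100 * se_a, 1), round(100 * se_b, 1), round(100 * se_0, 1))   # in per cent
```

For these inputs the results agree, to one decimal place, with the standard errors quoted in Example No. 1.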


VII. Testing the Model 


An inspection of Table 1 together with the formulas of Section V
indicates that the M.L. estimates $\hat{\pi}_a$, $\hat{\pi}_b$, are such that the observed
and expected values of $N_{aa}$, $N_{bb}$, agree perfectly. The fit of the model
may, however, be checked by the usual $\chi^2$ goodness of fit test on
categories 3 to 9, as the expectations in these categories merely arise from
a multinomial with the parameter $p$ estimated from the data, and the
totals made to agree by multiplication by the factor $N\hat{\pi}_0$, so that the
$\chi^2$ obtained by summing (Observed − Expected)²/Expected terms over
the last seven categories of Table 1 has five degrees of freedom.
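
The goodness of fit statistic can be sketched as below. The expected numbers are those of Table 1 for categories 3 to 9; the function name and the dictionary keys are shorthand introduced here, and the observed counts and rounded estimates are those given for Example No. 1.

```python
def chi_square_fit(observed, n, p, pi_0):
    """Sum of (Observed - Expected)^2 / Expected over categories 3 to 9 of Table 1."""
    q = 1.0 - 2.0 * p
    expected = {
        "ab": n * p * p * pi_0, "ba": n * p * p * pi_0,
        "a0": n * p * q * pi_0, "0a": n * p * q * pi_0,
        "b0": n * p * q * pi_0, "0b": n * p * q * pi_0,
        "00": n * q * q * pi_0,
    }
    return sum((observed[key] - expected[key]) ** 2 / expected[key] for key in expected)

observed_1 = {"ab": 8, "ba": 14, "a0": 14, "0a": 12, "b0": 17, "0b": 11, "00": 24}
chi2_1 = chi_square_fit(observed_1, n=900, p=0.2870, pi_0=0.133)

# Upper 5% point of chi-square with 5 degrees of freedom.
CRITICAL_5_DF = 11.07
print(round(chi2_1, 1), chi2_1 > CRITICAL_5_DF)
```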

Thus, the appropriateness of the model is tested by the fit in these 
seven categories. Some common departures are discussed below. As 
will be seen, in certain extreme cases negative estimates of $\pi_a$, $\pi_b$, can 
arise, and their confidence range may not overlap zero. The second 
departure discussed in VIII below shows that the method of interviewing 
rather than the model may be at fault. 

There is an unpublished manuscript by R. A. Bradley of the Virginia 
Polytechnic Institute which describes a Modified Duo-Trio Procedure. 
This consists of presenting to consumers a sample of products A and B, 
asking them for their preference, if any, and having recorded their 
vote, one presents to them a third sample which may be A or B. They 
are told that this third sample is identical to either the first or the 
second which they tasted. They are then asked to match it. Obviously, 
their preference was given and recorded before their success or failure 
to match correctly could psychologically influence them, whether they 
are informed of success or failure or not. One may then tabulate the 
preferences of those who matched correctly and also of those who did 
not. On further allowing for some of those who matched to have done 
so by pure chance, and removing them, one can obtain estimates which 
correspond to $\hat{\pi}_a$ and $\hat{\pi}_b$. This furnishes an indirect way of testing the
model, viz. estimating $\hat{\pi}_a$ and $\hat{\pi}_b$ by both procedures for the same pair
of products A and B. In one of the illustrations below this was done, 
and the estimates obtained agreed well within the respective confidence 
ranges. 


VIII. Departures from the Model 


Two important departures from the model have been encountered 
in practice. The first of these occurred in store tests, and was a fatigue 
phenomenon, viz. the observed values $N_{a0}$, $N_{b0}$, were significantly
larger than the observed values of $N_{0a}$, $N_{0b}$. These four numbers have
the same expectation under the model. The increased number of no
preference votes on the second pair presented was found due to
presenting it too soon after the first pair, but the model still worked
effectively in estimating the $\pi_i$, the reason being that in the estimation the
four quantities involved occur only as the sum $N_y$, so that there is an
averaging effect. The second departure from the model occurred when
a questionnaire was used, which by its wording “forced” an excess of
the 0-type consumer into the $N_{ab} + N_{ba}$ categories, i.e. it forced them
to guess and indicate a preference even when they had none (and really
belonged in the $N_{00}$ category or one of the $N_y$ categories). The
combination of such a questionnaire with a low $\pi_a$ or $\pi_b$ in the population
may lead to negative estimates $\hat{\pi}_a$ or $\hat{\pi}_b$. There are several reports in
the literature of consumer surveys where (by their respective models)
an excess of reversals of preference (i.e. an excess of $N_{ab} + N_{ba}$ votes)
is recorded. This can now be explained as due to a large $\pi_0$, together
with a questionnaire demanding a preference, and superposed on this
is the psychological attitude of the consumer asked to participate in
the test, who is only too eager to indicate a preference “to oblige” even
if he has none. The k-visit method detects this, and leads to good
estimates of the $\pi_i$.


IX. Extensions of the Model 


The 2-visit method has so far been discussed. Extension to k > 2 
visits is easy. Table 2 shows the expected numbers in the 27 categories 
for k = 3 visits. Construction of the likelihood function and derivation 
of estimates of the parameters (no more in number than for k = 2) 
follows exactly as for the k = 2 case, but with an increased amount of 
algebra. 
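
For general k the expected proportion in any category follows the same pattern as in Table 1: a type-0 consumer contributes a factor $p$ for each vote for A or B and a factor $1 - 2p$ for each no preference vote, while types a and b contribute only to the all-A and all-B sequences. The sketch below illustrates this under the model of Section III; the function name and the parameter values are hypothetical.

```python
from itertools import product

def sequence_probability(votes, pi_a, pi_b, pi_0, p):
    """Probability of one ordered sequence of votes ('A', 'B' or '0') over k visits."""
    k = len(votes)
    n_none = sum(1 for v in votes if v == "0")
    prob = pi_0 * p ** (k - n_none) * (1.0 - 2.0 * p) ** n_none   # type-0 consumers
    if all(v == "A" for v in votes):
        prob += pi_a                                              # type-a consumers always vote A
    if all(v == "B" for v in votes):
        prob += pi_b                                              # type-b consumers always vote B
    return prob

# Hypothetical parameter values, for illustration only.
params = dict(pi_a=0.5, pi_b=0.3, pi_0=0.2, p=0.3)
categories = list(product("AB0", repeat=3))          # the 27 categories for k = 3
total = sum(sequence_probability(seq, **params) for seq in categories)
print(len(categories), round(total, 12))
```

Multiplying each probability by N gives the expected number in the corresponding category.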

Extension to more than two products suggests itself. This is very 
much more difficult. Even in the case of three products A, B, C, and 
k = 2 visits, we must introduce additional parameters, to represent 
those proportions of the population who are of the type who prefer 
product A to products B and C, but have no preference as regards B 
and C. It will be found that 13 parameters are necessary to describe 
the population completely, and that 169 categories arise as a result of 
the 2-visit procedure. The 13 types of individuals may be written in 
short-hand fashion as follows in Table 3 below: 


X. Numerical Illustrations 


Example No. 1 


Two orange juices were consumer-tested. The products differed 
in degrees of sweetness and sourness. 900 judgments were obtained 
by the 2-visit method. The data were as shown in Table 4 below. The 
calculated estimates with their standard errors were: 

$\hat{p} = .2870$, $\hat{\pi}_a = 49.7 \pm 1.7\%$, $\hat{\pi}_b = 37.0 \pm 1.7\%$, $\hat{\pi}_0 = 13.3 \pm 1.3\%$.


The goodness of fit test yields, with 5 d.f., $\chi^2 = 4.1$ (not significant).


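TABLE 2
CATEGORIES FOR 3-VISIT SAMPLING

Category No. | Observed No.                                                                          | Expected No.
1            | $N_{aaa}$                                                                             | $(\pi_a + p^3\pi_0)N$
2            | $N_{bbb}$                                                                             | $(\pi_b + p^3\pi_0)N$
3 to 8       | the six sequences of A and B votes that are not all alike ($N_{aab}$, etc.)           | $p^3\pi_0 N$ each
9 to 20      | the twelve sequences with exactly one no preference vote ($N_{aa0}$, $N_{0aa}$, etc.) | $p^2(1-2p)\pi_0 N$ each
21 to 26     | $N_{a00}$, $N_{0a0}$, $N_{00a}$, $N_{b00}$, $N_{0b0}$, $N_{00b}$                      | $p(1-2p)^2\pi_0 N$ each
27           | $N_{000}$                                                                             | $(1-2p)^3\pi_0 N$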


TABLE 3
THE 13 CONSUMER TYPES IN THE POPULATION IN RESPECT TO 3 PRODUCTS

Type No. | Description of Type
1        | A>B>C
2        | A>C>B
3        | B>A>C      6 ways of explicit order of
4        | B>C>A      preference
5        | C>A>B
6        | C>B>A
7        | A=B>C
8        | A=C>B
9        | B=A>C      6 ways of pairwise
10       | B=C>A      indifference
11       | C=A>B
12       | C=B>A
13       | A=B=C      complete indifference


where > denotes “preferred to,” and = denotes “equally as preferred as.”



48 BIOMETRICS, MARCH 1958 


Example No. 2 


Two experimental instant coffees were consumer-tested. The 
products differed in a number of qualities. 450 judgments were obtained 
by the 2-visit method. Table 4 shows the data. The estimates with 
their standard errors were calculated as follows: 


$\hat{p} = .4104$, $\hat{\pi}_a = 11.8 \pm 2.7\%$, $\hat{\pi}_b = 1.4 \pm 2.3\%$, $\hat{\pi}_0 = 86.8 \pm 3.9\%$.


The goodness of fit test yields, with 5 d.f., $\chi^2 = 3.7$ (not significant).

Examples 1 and 2 were selected as illustrations for an additional 
reason. It is common practice in conventional consumer tests (1-visit) 
to add half the number of no-preference votes to the preferences for 
A and B, and then to form a preference ratio. Now, in Examples 1 
and 2 this ratio is given by 1.29 and 1.23 respectively. Example 1:
49.7 + ½(13.3) = 56.3, 56.3/43.8 = 1.29; Example 2: 11.8 + ½(86.8) =
55.2, 55.2/44.8 = 1.23. Yet, the marketing decision made in each
case would be almost certainly different despite the similarity of these
ratios. In Example 1 both products would probably be marketed,
because of the large $\hat{\pi}_a$ and $\hat{\pi}_b$; in Example 2 only the one currently
produced or the one easier or cheaper to produce would be marketed,
because of the large $\hat{\pi}_0$.
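
The conventional ratio is easily reproduced. In the sketch below the function name is introduced for illustration, and the inputs are the percentage estimates quoted in Examples 1 and 2.

```python
def conventional_ratio(pct_a, pct_b, pct_none):
    """Preference ratio after adding half the no-preference share to each product."""
    share_a = pct_a + pct_none / 2.0
    share_b = pct_b + pct_none / 2.0
    return share_a / share_b

ratio_1 = conventional_ratio(49.7, 37.0, 13.3)   # Example 1
ratio_2 = conventional_ratio(11.8, 1.4, 86.8)    # Example 2
print(round(ratio_1, 2), round(ratio_2, 2))
```

The two ratios are nearly equal although the underlying proportions of consumers with no real preference differ widely (13.3% against 86.8%).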


Example No. 3 
Two brands of lemon pie filling were compared in a consumer-test,
using a panel of 450 consumers and the 2-visit procedure. The data
are shown in Table 4. The calculated estimates with their standard
errors were:


$\hat{p} = .3743$, $\hat{\pi}_a = 69.1 \pm 2.3\%$, $\hat{\pi}_b = 11.5 \pm 1.8\%$, $\hat{\pi}_0 = 19.4 \pm 2.4\%$.


The goodness of fit test yields, with 5 d.f., $\chi^2 = 3.4$ (not significant).
This last example illustrates the case of a clear-cut majority for one
of the products.


TABLE 4
NUMERICAL DATA FOR EXAMPLES NOS. 1, 2, AND 3

Category  | Example No. 1        | Example No. 2        | Example No. 3
Type      | Observed   Expected  | Observed   Expected  | Observed   Expected
$N_{aa}$  | 457        457.0     | 119        119.0     | 323        323.0
$N_{bb}$  | 343        343.0     | 72         72.0      | 64         64.0
$N_{ab}$  | 8          9.9       | 71         65.7      | 10         12.3
$N_{ba}$  | 14         9.9       | 62         65.7      | 13         12.3
$N_{a0}$  | 14         14.6      | 33         28.7      | 12         8.2
$N_{0a}$  | 12         14.6      | 23         28.7      | 7          8.2
$N_{b0}$  | 17         14.6      | 32         28.7      | 10         8.2
$N_{0b}$  | 11         14.6      | 24         28.7      | 7          8.2
$N_{00}$  | 24         21.7      | 14         12.6      | 4          5.5


THE DETECTION OF INDIVIDUAL DIFFERENCES IN
ACCIDENT SUSCEPTIBILITY¹


Robert Fitzpatrick


American Institute for Research 
Pittsburgh, Pennsylvania, U.S.A. 


To those who have studied accidents, it has often seemed that 
there are “accident-prone” individuals who have more than their 
share of accidents. A more specific hypothesis is that there are stable 
differences in susceptibility to a particular kind of accident among 
individuals exposed to the same risk. If substantial individual differ- 
ences in susceptibility actually exist, approaches to the scientific under- 
standing and to the prevention of accidents should take somewhat 
different directions than they should if these differences are minor. 

Obviously, some members of any group have more accidents than 
others; and there is some reason to think that some people have more 
accidents per unit of hazard than others [e.g., 30]. However, differences 
in accident rates may be due simply to the random nature of the acci- 
dent variables, so that predictions made on the basis of past accident 
records may not be of much value. If, as has often turned out, the 
persons who seem this year to be accident prone do not look that way 
next year, we have not learned very much this year. 

The aim of this article is to review critically the methods which 
have been used to detect differential accident susceptibility. For 
convenience, the methods are discussed under three headings. 


THE UNIVARIATE APPROACH 


Form of the “chance” distribution. A logical first step, from the 
statistical point of view, in research on individual accident susceptibility 
is to find a satisfactory model to fit the hypothesis of “chance,” or no 
differential susceptibility. If each member of the group has and con- 
tinues to have the same probability of incurring an accident as each 
other member, then there is little justification for the further exploration 
of individual characteristics as determiners or predictors of accidents. 

¹This article is based in part upon research conducted under Contract AF 33(038)-13260, sponsored 
by the Human Factors Operations Research Laboratories, United States Air Force. The opinions are 
those of the author, and do not necessarily reflect the views of the Air Force. 

This null hypothesis has been represented, since its original intro- 
duction by Greenwood and Yule [15], by the Poisson distribution. A 
modern development [e.g., 10] assumes that each individual has a 
probability of accident asymptotically equal to $p\,dt$ in the short time 
interval $dt$. The number of accidents for each individual during the 
time period $(0, t)$ is then a random variable distributed according to 
the Poisson distribution. 

This development assumes the same p for all individuals at all 
times. Practically, such a condition is hard to envision. In life, it 
seems obvious that the probability of an accident varies considerably 
from time to time. In automobile driving, for example, changes in the 
weather, the road surface, competing traffic, alertness of the driver, 
and many other aspects of the environment and the individual seem 
clearly to change the probability of accident. How is it then that the 
Poisson fits many accident distributions in which complete control of 
such variables has not been and probably cannot be achieved? There 
appear to be three possible reasons: (a) The variations in p may not 
be large enough to affect the distribution noticeably. (b) Another 
hypothesis, more consistent with the circumstances under which 
accident data are usually collected, may generate the Poisson as a model. 
(c) The methods used to test the fit of observed data to the Poisson 
may lack power, so that they do not detect many real deviations from 
it. The validity of reason (a) is highly unlikely in view of the large 
differences which are found between accident rates in, for example, 
flying at night and in the daytime [28]. Reasons (b) and (c) will be 
discussed in turn. 

Koopman [20] has shown that the necessary and sufficient conditions 
for the Poisson distribution are that 

$$\lim_{n \to \infty} m(n) = m,$$

where $m(n)$ is the expected number of successes (accidents) in the nth 
set of an infinite sequence of sets of n independent trials; and m is the 
expected value, or the mean, of s, the number of successes in the nth 
set of trials, in a Poisson distribution; and that 

$$\lim_{n \to \infty} \max_{1 \le k \le n} p_{n,k} = 0,$$

where $p_{n,k}$ is the probability of a success on the kth trial in the nth set. 
In practice, these conditions would be considered reasonably well 
satisfied if there are sufficient cases so that the function governing the 
probability of occurrence of accidents may be considered continuous, 
and if the probabilities of accidents on all occasions are quite small, 
but not necessarily equal. Koopman goes on to show that the same 
conditions apply “when the probabilities $p_{n,k}$ keep in a pre-assigned 
ratio independent of m” [20, p. 813]; and cites examples in which “the 
Poisson distribution is obtained with such unequal $p_{n,k}$ that 

$$\max_{1 \le k \le n} p_{n,k} \Big/ \min_{1 \le k \le n} p_{n,k}$$

becomes infinite as any given constant power of n” [20, pp. 813-814]. 
Koopman’s results have been supported and extended by Walsh [29] 
to certain cases in which each event (accident) is not required to be 
independent of all the other events. 
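
The practical content of these conditions can be checked numerically in a small case. The Python sketch below is an illustration under stated assumptions, not a reproduction of Koopman's argument: the 1,000 occasions, the hundredfold linear spread of probabilities, the mean of 2 accidents, and all function names are hypothetical. It computes the exact distribution of the number of accidents over independent occasions with unequal small probabilities and measures its total-variation distance from the Poisson with the same mean.

```python
import math

def poisson_binomial_pmf(probs):
    """Exact pmf of the number of successes in independent trials with
    possibly unequal success probabilities, built by direct convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)      # no accident on this occasion
            nxt[k + 1] += mass * p          # an accident on this occasion
        pmf = nxt
    return pmf

def poisson_pmf(k, m):
    """Poisson probability of k events with mean m (computed on the log scale)."""
    return math.exp(-m + k * math.log(m) - math.lgamma(k + 1))

def total_variation_from_poisson(probs):
    """Half the summed absolute difference between the exact distribution
    and the Poisson with the same mean, including the Poisson tail mass."""
    m = sum(probs)
    exact = poisson_binomial_pmf(probs)
    diff = sum(abs(e - poisson_pmf(k, m)) for k, e in enumerate(exact))
    tail = 1.0 - sum(poisson_pmf(k, m) for k in range(len(exact)))
    return 0.5 * (diff + max(tail, 0.0))

# Assumed setting: 1,000 occasions whose hazards differ a hundredfold,
# rescaled so that the expected number of accidents is 2.
n = 1000
weights = [1.0 + 99.0 * k / (n - 1) for k in range(n)]
scale = 2.0 / sum(weights)
probs = [scale * w for w in weights]

tv_unequal = total_variation_from_poisson(probs)          # many small, unequal p
tv_few_large = total_variation_from_poisson([0.5] * 4)    # few large p, same mean
print(tv_unequal, tv_few_large)
```

In this configuration the first distance is small (below 0.01) despite the hundredfold inequality, while four occasions with probability one-half each give a clearly non-Poisson result. The contrast is between small and large probabilities, not between equal and unequal ones.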

Thus, there is clearly one hypothesis, and probably a large number 
of hypotheses, which give rise to the Poisson distribution under the 
assumption that the momentary probability is not constant. This 
kind of hypothesis seems consistent with the common observation that 
environmental hazards do vary. Unfortunately, it is also consistent 
with an assumption of variable probabilities among or within individuals. 
If a Poisson distribution is observed, there may be equal susceptibility 
among individuals; but fitting a Poisson is not sufficient, or perhaps 
even necessary, to prove equal susceptibility. 

However, it may still be true that in practice the kind of observed 
data which are well described by a Poisson are those which cannot 
be predicted by measures of the individual. This is a matter for 
empirical test. 

To determine whether empirical data are reasonably described by a 
Poisson distribution, it is necessary to use some statistical test of the 
fit. If the test cannot detect wide differences from the expected Poisson 
values, it may be merely as an artifact of the inability of the test to 
discriminate that investigators have found the Poisson to fit their 
accident data. 

There are two classical tests of fit to the Poisson distribution. One 
is the ordinary chi-square test of goodness of fit. The other depends 
on the fact that in a Poisson distribution the mean is equal to the 
variance. The criticism of these tests by Fisher [11] is concise but 
comprehensive, and will be followed here. In Fisher’s notation, $S$ 
indicates summation from zero to r; r is a number of accidents; $a_r$ is 
the number of individuals who incur r accidents. He defines two 
numbers, 

$$A = S(a_r) \quad \text{and} \quad B = S(r a_r).$$

Then $B/A = \bar{r}$, a sufficient estimate of m, the Poisson parameter. 
“If m is small and the series short, evidence of deviation may come 
chiefly or wholly from frequencies with low expectation, so that the 
measure of general discrepancy 

$$\chi^2 = S\,(a_r - m_r)^2 / m_r$$

will have a sampling distribution not accurately given by the usual 
table of $\chi^2$. Equally, the special test for discrepancy of the variance, 
the index of dispersion 

$$\chi^2 = S\,[a_r (r - \bar{r})^2] / \bar{r}$$

for $A - 1$ degrees of freedom, will be suspect owing to the low estimated 
expectation, $\bar{r}$, in each of the A cells” [11, p. 17]. 

In the general chi-square test, it is customary to combine cells, so 
that the expected values are all larger than some arbitrary value, 
usually ten [9, p. 420]. But this procedure reduces the number of 
degrees of freedom and tends to represent the data in so few categories 
that only gross deviations can be detected. 

Fisher has proposed a new test, to be used “when expectations are 
liable to be very small” [11, p. 22]: 

$$\chi^2 / 2 = S\, a_r \log \frac{a_r}{m_r},$$


where the logarithms are taken to the base e. The right-hand member 
“is essentially the logarithmic difference in likelihood between the most 
likely Poisson Series and the most likely theoretical series without 
restriction” [11, p. 24]. In this test also, it is necessary that the expected 
values, $m_r$, should be reasonably large. However, the test appears 
in Fisher’s example to be more sensitive than either of the other tests, 
so that it may be useful in detecting deviations which would be missed 
otherwise. 
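
The three statistics discussed here (the general chi-square, the index of dispersion, and Fisher's logarithmic criterion) can be computed side by side from one frequency table. The following Python sketch is only illustrative: the frequency counts are invented, the function and key names are hypothetical, and, as an assumption of the sketch, expected frequencies are computed only for the observed classes, with no pooling of cells and no tail class beyond the largest observed count. Significance levels are deliberately not computed.

```python
import math

def poisson_fit_statistics(freq):
    """freq[r] is a_r, the number of individuals with exactly r accidents."""
    A = sum(freq)                                   # number of individuals
    B = sum(r * a for r, a in enumerate(freq))      # total number of accidents
    mean = B / A                                    # estimate of the Poisson parameter
    # Expected frequencies m_r under a Poisson with that mean (observed classes only).
    expected = [A * math.exp(-mean) * mean ** r / math.factorial(r)
                for r in range(len(freq))]
    # General measure of discrepancy: S (a_r - m_r)^2 / m_r.
    pearson = sum((a - m) ** 2 / m for a, m in zip(freq, expected))
    # Index of dispersion: S a_r (r - mean)^2 / mean, referred to A - 1 d.f.
    dispersion = sum(a * (r - mean) ** 2 for r, a in enumerate(freq)) / mean
    # Fisher's criterion: chi^2 / 2 = S a_r log(a_r / m_r); empty classes add nothing.
    likelihood_ratio = 2.0 * sum(a * math.log(a / m)
                                 for a, m in zip(freq, expected) if a > 0)
    return {"mean": mean, "pearson": pearson, "dispersion": dispersion,
            "dispersion_df": A - 1, "likelihood_ratio": likelihood_ratio}

# Invented counts: 200 individuals with no accident, 120 with one, and so on.
stats = poisson_fit_statistics([200, 120, 50, 20, 8, 2])
print(stats)
```

Running the three on the same table makes it easy to see how much of each total comes from the sparse upper classes, which is the point of Fisher's criticism.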

In a study of accidents of Air Force fighter pilots [12, 13], it was 
found that Fisher’s procedure did indicate a significant difference from 
the Poisson at the 1% level of confidence when the standard procedure 
did not, in one distribution out of eight. The two procedures agreed 
in indicating significant results in two distributions. Neither procedure 
indicated significance in the remaining distributions. It would seem 
desirable to have further applications of the Fisher procedure. 

The Poisson distribution is not logically above question as a model 
of “chance” in the occurrence of accidents to individuals; and the 
precision with which a given distribution can be categorized as Poisson 
or non-Poisson is not very satisfactory. It would seem that there 
should be some hesitancy about using the Poisson distribution to 
detect individual differences in accident susceptibility. 
Application of the Poisson model. The common application of the 
Poisson model to accident statistics has been indicated at several points 
in the preceding section. Following Greenwood and Yule [15], it has 
been customary to assume that a good fit of observed data to the 
Poisson indicates “uncomplicated chance;” and that a poor fit indicates 
the existence of systematic “non-chance” factors. 

Another way of using the Poisson model was developed originally 
by Newbold [26] and later by Cobb [8]. This is essentially an analysis 
of variance, depending upon the equality of the mean and variance 
in the Poisson distribution. It is assumed that the sample, whether 
or not it follows the Poisson as a whole, may be divided into a number 
of sub-classes each of which is distributed in the Poisson manner. Then, 
the within-groups variance may be estimated without actual knowledge 
of the composition of the sub-classes. The weighted average of the 
sub-class variances will be the same as the weighted average of their 
means: the mean of the whole group. The total variance is, of course, 
known; hence the between-groups variance may be obtained by 
subtraction. Thus, the within-groups variance is the total mean; 
and the between-groups variance is the difference between the total 
variance and the total mean. 

Cobb then concerns himself with the regression line along the means 
of the sub-classes. The weighted mean of the variances of the sub- 
classes (i.e., the within-groups variance) may also be interpreted as 
the variance from that regression. The variance due to regression is 
the between-groups variance. Then, the correlation which can be 
obtained between any predictor and the accident data is limited to 
the extent that regression accounts for the total variance. The maxi- 
mum correlation is the same as the correlation between a true and a 
fallible measure of the same thing, as given, for example, by Kelley 
[18, p. 409]. In this case, 


$$r = \sqrt{\frac{V - M}{V}},$$


where V is the variance and M is the mean. It is apparent that, in a 
Poisson distribution, the value of this theoretical coefficient will be 
zero; it will rise slowly as the variance exceeds the mean. Occasionally, 
the mean exceeds the variance; but this is usually interpreted as only 
a sampling variation and r is considered to be zero. 

Mintz and Blum [25] have taken a slightly different approach. 
They simply express the difference between variance and mean as a 
percentage of the total variance. This is equivalent to squaring the 
above equation; the interpretation is that the result expresses the 
maximum percentage of variance which may theoretically be predicted. 
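
The arithmetic of this partition is short enough to sketch. The Python fragment below assumes, as the procedure itself does, that the sample is made up of hidden Poisson sub-classes; the accident counts and the function name are hypothetical. It takes the total mean as the within-groups variance, obtains the between-groups variance by subtraction, and reports both the theoretical maximum correlation and its square, treating a mean that exceeds the variance as giving zero.

```python
import math

def poisson_variance_partition(counts):
    """counts[i] is the number of accidents of individual i in the period."""
    n = len(counts)
    total_mean = sum(counts) / n
    total_var = sum((c - total_mean) ** 2 for c in counts) / n
    within = total_mean                           # mean = variance inside each sub-class
    between = max(total_var - total_mean, 0.0)    # negative differences read as zero
    share = between / total_var if total_var > 0 else 0.0
    return {"within": within,
            "between": between,
            "max_r": math.sqrt(share),            # theoretical maximum correlation
            "max_share": share}                   # proportion of variance predictable

# Invented accident counts for seven individuals.
example = poisson_variance_partition([0, 0, 0, 1, 1, 2, 5])
print(example)
```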

Basic to each of these procedures is the assumption that a series of 
Poisson distributions is present. Whether this is so cannot be established 
empirically, for the theory does not state any means of identifying the 
members of the sub-classes. An empirical rebuttal of the theory would 
be available if a case (or, preferably, cases) could be found in which a 
variable correlates higher than the theoretical maximum with a par- 
ticular kind of accident data. In Thorndike’s thorough review of the 
literature [28], no such case was found. However, (a) in most instances 
the data in the literature were not reported in such a way that the 
theoretical maximum could be calculated, and (b) most studies have 
not controlled exposure to risk in such a way that substantial prediction 
would be expected [28, p. 36 ff.]. 

These procedures translate the findings of Poisson fit into terms 
which are potentially useful to the investigator who wants to know 
what order of prediction he may expect to be able to make. But he 
must first assume that there are Poisson distributions hidden within 
his data. 

Models of “other-than-chance” distributions. If it appears that the 
hypothesis of equal susceptibility among all individuals at all times 
does not hold, it is reasonable to seek an alternative hypothesis. There 
are essentially only two important univariate hypotheses available. 
These were proposed more than 30 years ago by Greenwood and Woods 
[14] and Greenwood and Yule [15]. Building on the work of Pearson 
on multiple cancer deaths, these authors considered two kinds of possible 
theoretical distributions to fit their data: (a) “an uncomplicated chance 
distribution,” the Poisson, in which it is presumed that the distributing 
factors are independent of the previous history and character of the 
workers; and (b) “a modified chance distribution,” in which the popu- 
lation becomes differentiated into “uncomplicated chance” sub-groups. 
A modified chance distribution may take two forms. In what will here 
be referred to as Case I, the numbers in the sub-groups undergo con- 
tinuous modification from the beginning to the end of the period of 
observation. In Case II, “the population is ab-initio divisible into 
sub-groups for each of which the chance is different but constant 
throughout” [15, p. 256]. Of these two “modified chance” cases, Case 
I was considered to correspond to the hypothesis of equal original 
accident susceptibility which would be modified by the occurrence of 
each accident. Case II corresponds to the hypothesis of varying 
susceptibilities among the subjects, but constant susceptibility for each 
subject. Kerrich [19] and Bates and Neyman [3, 4] have recently 
extended and refined models for the two cases. 
In Case II, they have followed closely the assumptions due chiefly 
to Newbold [27]. It is presumed that the over-all distribution of 
accidents is a compound formed from a number of sub-distributions, 
each of which is a Poisson, and that the parameters of these Poissons 
are distributed in the manner of a Pearson Type III curve. The use of 
the Pearson Type III curve “is justified both by its flexibility as an 
interpolation formula and by the tradition established by Greenwood, 
Yule, and Newbold” [8, p. 217]. The resulting model is a negative 
binomial distribution. 
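
That a Pearson Type III (gamma) distribution of Poisson parameters yields a negative binomial can be confirmed numerically. The sketch below is a check under assumed values, not the derivation used by the authors cited: the shape and scale values, the integration grid, and the function names are all hypothetical. It integrates the Poisson probability against the gamma density by the midpoint rule and compares the result with the closed-form negative binomial probability.

```python
import math

def negative_binomial_pmf(k, shape, scale):
    """Closed form for a Poisson whose mean has a gamma(shape, scale) distribution."""
    q = scale / (1.0 + scale)
    log_p = (math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
             + shape * math.log(1.0 - q) + k * math.log(q))
    return math.exp(log_p)

def mixed_poisson_pmf(k, shape, scale, steps=20000, upper=60.0):
    """Midpoint-rule integral over lam of Poisson(k | lam) * gamma density(lam)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = -lam + k * math.log(lam) - math.lgamma(k + 1)
        log_gamma = ((shape - 1.0) * math.log(lam) - lam / scale
                     - math.lgamma(shape) - shape * math.log(scale))
        total += math.exp(log_poisson + log_gamma)
    return total * h

# Assumed parameters of the distribution of individual accident rates.
shape, scale = 2.0, 0.75
for k in range(6):
    print(k, mixed_poisson_pmf(k, shape, scale), negative_binomial_pmf(k, shape, scale))
```

The two columns agree to within the accuracy of the numerical integration, which is all the sketch is meant to show.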

In Case I, it is assumed that all members of the population have 
equal initial probabilities of accident, and that these probabilities 
remain constant except when an individual incurs an accident. Kerrich 
then assumes that the probability changes only after the first accident, 
and that the change is always a decrease in the probability of incurring 
further accidents [19, p. 410]. He applies methods developed in the 
study of “contagious” distributions and arrives at a distribution law 
in the form of a negative binomial. Although the components of this 
negative binomial are slightly different from those derived under the 
Case II assumptions, Kerrich states that, when the same methods are 
used for fitting the two distributions to observed data, “identical 
numerical results follow” [19, p. 412]. That is, the two dissimilar hypo- 
theses lead to the same mathematical model; and statistical tests will 
lead to exactly the same conclusion about one hypothesis as they do 
about the other. 

Bates and Neyman developed a more general and comprehensive 
pair of models. They also found that the two cases cannot be dif- 
ferentiated in the univariate case: “It is seen that, if the observations 
are limited to one period only, e.g., from $T_1$ to $T_2$, then the distributions 
implied by the two models are single-variate negative binomials with 
two parameters each and are indistinguishable” [4, p. 269]. A similar 
and more general conclusion is stated by Feller: “It seems that none 
of the distributions which are now used in the literature permits the 
conclusion that the phenomenon described is contagious” [10, p. 414]. 

In the univariate distribution, the hypotheses of unequal initial 
susceptibility (Case II) and of changes in susceptibility as a result of 
accidents (Case I) produced the same model. Hence, the “non-chance” 
hypotheses cannot be differentiated. The available tests, in any event, 
are not so powerful as to be able to distinguish accurately between the 
Poisson (“chance”) and the negative binomial (“non-chance”) dis- 
tributions. 

There are many other “non-chance” hypotheses which might be 
developed. Many of these would no doubt lead to univariate distributions 
other than the negative binomial. Some hints of the types 
of distributions which might be useful have been provided by Newbold 
and others. However, only the negative binomial has been investigated. 
In the univariate case, the negative binomial appears to have little 
promise as a means of testing hypotheses about accident proneness. 


THE BIVARIATE APPROACH 


Bivariate models. Following Newbold [26, 27], a number of in- 
vestigators have divided their accident data into two successive time 
periods. The rationale for such a procedure was a practical one: if 
the same people who have accidents in the first period also have them 
in the second, then “accident proneness” was present and it would 
have been possible to reduce the accident rate in the second period by 
eliminating the individuals who had high accident rates in the first 
period. 

The formal models which have been proposed for the bivariate 
cases are parallel to those which have been discussed for the univariate 
case. The “chance” model is again the Poisson, although few authors 
have said so explicitly. The presentation by Kerrich [19] is most 
useful. He defines what he calls a “homogeneous” population with 
respect to accidents, by means of two equations. The first equation 
indicates that the probabilities of various numbers of accidents are 
independent of one another. The second equation establishes for each 
time period a distribution law which leads directly to the Poisson model. 

A homogeneous population whose accidents are plotted for two 
periods of time should produce a bivariate Poisson distribution. (It 
is not necessary that the parameters of the marginal Poisson distri- 
butions be equal.) As Maritz [22] has pointed out, it is possible to 
construct a bivariate Poisson distribution in which there is a correlation 
between the two marginal Poissons. Whether such a set of circum- 
stances is at all likely in practice has been questioned by Blum and 
Mintz [5]. In any event, it is necessary for rigor that Kerrich further 
stipulate that the (population) correlation shall be zero. This he does 
in his first equation. 

It appears that no other set of hypotheses has been postulated which 
leads to the bivariate Poisson model envisioned by Kerrich. This, of 
course, does not mean that there is no such set. It is not unreasonable 
to suppose that, if the two univariate marginal Poissons can be generated 
by “non-chance” hypotheses, then the bivariate Poisson can also be 
so generated. The risk here is common to all cases of inductive inference. 

Our Case II model becomes, in the bivariate case, a bivariate negative 
binomial. Building upon essentially the same assumptions which 
originally led Greenwood and Yule [15] to the negative binomial model, 
both Kerrich [19] and Bates and Neyman [3, 4] arrive at a bivariate 
negative binomial. As developed by Kerrich, the equation is a com- 
pound of a number of Poisson distributions, whose parameters are 
distributed in the manner of a Pearson Type III curve. Similarly, 
Kerrich’s bivariate “contagious” distribution, our Case I, results in a 
negative binomial with slightly different parameters. This result is also 
directly analogous to the result for the univariate case, and was de- 
veloped under essentially the same assumptions, except that the 
occurrence of an accident was assumed to increase rather than decrease 
the probability of further accidents. 

The presumed advantage of extending these two models to the 
bivariate case is that they can now be discriminated. Kerrich does not 
consider whether they may in certain cases be equal. Bates and Neyman 
[4, p. 269] state the necessary and sufficient condition for equality of 
their more general, but otherwise analogous, models: 


where $\lambda$ is the Poisson parameter, $\mu$ is a constant associated with the 
number of accidents incurred prior to the beginning of the period under 
consideration and $\nu$ is a constant associated with the time at the be- 
ginning of the period. That this condition will be fulfilled is, Bates 
and Neyman indicate, highly unlikely. However, it seems somewhat 
more likely that the condition may often be almost fulfilled, so that 
only a very powerful test will detect the inequality. Bates and Neyman 
appear to indicate this when, in a reference to a test (which they do not 
describe) for differentiating between the two distributions, they say: 
“... the authors anticipate that the power of the test contemplated 
will not be a very satisfactory one” [4, p. 272]. 

An example given by Kerrich is of interest in this regard. He fits 
his two “non-chance” distributions to a bivariate distribution of 
accidents of 122 experienced shunters during an 11-year period. The 
fits are then tested by chi-square. In neither case can the hypothesis 
be rejected that the data arise from the population hypothesized; the 
probability values are .40 and .59 for the marginal Case I distributions 
and .15 for both marginal Case II distributions. Kerrich chooses the 
“unequal liability” distribution, our Case II, since its values “fit the 
data rather better” [19, p. 423]. The bivariate Poisson may be rejected 
with some confidence, since the probability of a fit is less than .001. 

In another study [12], the accidents of eight groups of fighter pilots 
were divided into two distributions representing odd and even quarters 
of the year. The groups overlapped to a great extent; that is, most of 
the fighter pilots were members of several of the groups.² The probabili- 
ties for a fit by the two “non-chance” distributions and by the bivariate 
Poisson are shown in Table 1. The methods given by Kerrich [19] 


TABLE 1 


Probabilities that Three Mathematical Models Fit 
Distributions of Fighter Pilot Accidents 


                                  Model 
Distribution    “Non-chance”    “Non-chance”    “Chance” 
                Case I*         Case II**       (Bivariate 
                                                Poisson) 
     1            .03             .04             .001 
     2            ***             ***             .003 
     3           <.001           <.001            .01 
     4           <.001            .35             .03 
     5            .15             .36            <.001 
     6           [.07; other entry illegible]     .003 
     7            .39             .29             .001 
     8            .09            <.001            .11 


* Equal initial susceptibility, modified by each accident occurrence. 
** Unequal initial susceptibility, constant for each subject. 
*** Not applicable, because the mean of this empirical distribution is greater than the variance. 


were used in calculating the values in this table. An inspection of the 
table reveals an unusual set of contradictions. If the 1% level of 
confidence is used to detect significant results: 

In 4 out of 7 applicable cases, both “non-chance” distributions fit 
the data. 

In 1 case, none of the three distributions fits the data. 

In 1 case, both the Poisson and the Case I distribution fit the data. 

If these examples are at all typical, it seems that the “chance” 
model may not always be differentiated from the “non-chance” models, 
and that it will usually not be possible to discriminate between the two 
“non-chance” models. If this is the case, there appears to be limited 
benefit from extension of the models to the bivariate situation. 

²In this, as in many other studies, the groups used were not as homogeneous with respect to 
accident risk as would be desired. This lack of homogeneity may well account in some measure for 
the negative results obtained. However, the argument here is not concerned with the positive or 
negative results of this study, but rather with the matter of inconsistency of conclusions based on 
different methods of analysis. There seems no reason to suppose that the inconsistencies would 
disappear in a more homogeneous group. It is, in any event, difficult to find a highly homogeneous 
group of such a size as to support statistical analyses of the sort under discussion. 

The correlational approach. A somewhat different approach using 
bivariate distributions is that of correlation between numbers of 
accidents in different time periods. It is customary to use successive 
time periods, but this is not necessary. 

This procedure has been vigorously defended by Maritz [22]. He 
shows that two univariate Poisson distributions of accidents in the 
same sample may be correlated and that two similar negative binomials 
may be uncorrelated. He therefore rejects a curve fitting approach 
and advances correlation between successive periods as the only valid 
approach. His reason: “This statistical technique is nearest to the 
psychological definition of accident proneness” [22, p. 434]. 

There seem to be two ways of justifying the use of the correlational 
approach: 

(a) It is relevant in establishing whether or not the occurrence of 
accidents at one time is independent of their occurrence at another time. 
It may be remembered that Kerrich posits independence as a condition 
for a “homogeneous” population. This directly implies zero correlation. 
However, for Kerrich, the converse does not hold; zero correlation does 
not imply “pure chance.” He specifies another set of conditions for 
“pure chance;” i.e., the conditions traditionally assumed to underlie 
the Poisson distribution. Hence, from this point of view, the correl- 
ational approach does not provide an unambiguous answer. 

(b) The correlational approach establishes the extent to which the 
occurrence of accidents is consistent for individuals. Accident prone- 
ness is presumably a stable characteristic of the individual. If an 
individual has several accidents in one period of time and none in 
another comparable period, he should not be labelled “accident prone.” 
If the sample as a whole behaves in this way, it will presumably not be 
possible to predict individual susceptibility to accidents. To put the 
matter in other terms, the correlation coefficient measures the reliability 
of the accident criterion. Since a criterion cannot theoretically correlate 
with any measure to an extent greater than the square root of its 
reliability [18, p. 409], it is useless to attempt to predict susceptibility 
unless the correlation between accidents in two periods is significantly 
greater than zero. 

This second justification is that commonly applicable in psychological 
testing, and its general validity is not questioned. However, it is 
believed that it may not be useful in the case of accident data. 

For one thing, it is based upon normal curve theory. This may, of 
course, be of little practical importance when the coefficient is either 
very high or very low; but, for a major part of the range of correlations, 
the interpretation of the correlation coefficient is difficult. Maritz 
attempts to avoid this difficulty by interpreting the correlation co- 
efficient as “a sloping constant of the regression lines” [22, p. 434]. 
He then proceeds implicitly to interpret accident correlations in the 
same way that normal curve theory correlations are interpreted. (In- 
cidentally, Maritz demonstrates the linearity of regression on the 
assumption that the marginal distributions are Poissons, an assumption 
which he otherwise rejects as a useful one.) 

Burke [6] has proposed a chi-square test of significance of the de- 
viation of such a correlation coefficient from zero. The test is made 
independently of the correlation coefficient, and is offered only as a test 
of significance. The degree of relationship may, according to Burke, 
be inferred from the correlation if the chi-square test has shown the 
relationship to be significant. It is still true, however, that a coefficient 
of this kind should not be interpreted as indicating the same degree of 
relationship as an equal coefficient obtained when normal curve theory 
applies. A more serious difficulty with Burke’s procedure is that it 
assumes that the marginal distributions are Poissons. If the marginal 
distributions (in the population) are in fact not Poissons, the test is 
meaningless. If the marginal distributions are Poissons, the test is 
superfluous. Burke’s test uses essentially the same model and produces 
the same arithmetic results as Kerrich’s test of “homogeneity.” But 
the logic which Burke uses seems to be internally contradictory, since 
he supports Maritz’s rejection of the Poisson as a model by the ex- 
pedient of using the Poisson as a model. 

Blum and Mintz have discussed a related point: 

“The correlation between accident records in two periods can 
always be computed from the variances in those two periods and in the 
total period, according to the formula 


$$r = \frac{V_{1+2} - V_1 - V_2}{2\sqrt{V_1 V_2}}$$


... inasmuch as every observation period has an early stage when the 
variance is smaller than the mean, a distribution cannot have a variance 
greater than the mean unless there were positive correlations between 
successive periods somewhere in the past” [5, p. 417]. 
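
The formula quoted from Blum and Mintz follows from the identity that the variance of a sum equals the sum of the variances plus twice the covariance. A short Python check, using invented accident counts for eight individuals in two periods and hypothetical function names, compares the correlation recovered from the three variances with the ordinary product-moment coefficient.

```python
import math

def pvar(xs):
    """Population variance (divisor n)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def correlation_from_variances(first, second):
    """Correlation from the variances of each period and of the total period."""
    total = [x + y for x, y in zip(first, second)]
    v1, v2, v_total = pvar(first), pvar(second), pvar(total)
    return (v_total - v1 - v2) / (2.0 * math.sqrt(v1 * v2))

def pearson_r(first, second):
    """Ordinary product-moment correlation, for comparison."""
    n = len(first)
    m1, m2 = sum(first) / n, sum(second) / n
    cov = sum((x - m1) * (y - m2) for x, y in zip(first, second)) / n
    return cov / math.sqrt(pvar(first) * pvar(second))

# Invented accident counts for eight individuals in two periods.
period_1 = [0, 1, 0, 2, 3, 0, 1, 1]
period_2 = [1, 1, 0, 1, 4, 0, 0, 2]
print(correlation_from_variances(period_1, period_2), pearson_r(period_1, period_2))
```

The two values coincide for any data with non-zero variances in both periods, so the correlation carries no information beyond what the three variances already contain.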

This may be compared with a quotation from Maritz: 

“It appears, therefore, that by taking a long enough period of 
observation we should eventually find such differences between mean 
and variance as should be indicative of accident proneness, although 
it may not be sufficient evidence. However, the correlational approach 
is to be preferred since in practice we can say very rarely a priori how 
long such a period should be” [22, p. 437]. 

In view of the point made by Mintz and Blum, it is apparent that 
a lengthy period of observation favors not only divergence from 
Poissonian form, but a high correlation also. The correlational approach 
is not independent of the curve fitting approach, and cannot be justified 
without at least a partial acceptance of the curve fitting assumptions. 

It appears that, under certain circumstances, the correlation between 
periods will increase merely as an artifact accompanying increase in the 
length of the total observation period. Whether this effect is likely to 
have practical significance would be difficult to determine, since an 
increase in the observation period would be expected to produce in- 
creased reliability simply because of the concurrent increase in the 
number of cases, or accidents. 

A final question should be asked about the use of correlational 
methods in the detection of individual differences in accident sus- 
ceptibility: What correlation would be expected on the hypothesis 
of “chance?” Webb and Jones [31] and Burke [7] have shown that 
the expected correlation is not always zero. Consider a group of four 
individuals who have two accidents each. If these accidents are dis- 
tributed at random between the two periods (in which we may, for 
simplicity, assume that the probabilities of accident are equal) the 
expectation is that one will have both his accidents in the first period, 
one will have both in the second period, and the other two will have 
one accident in each period. On the basis of such reasoning, Webb 
and Jones build up their “binomially-partitioned” distribution. And 
it turns out that the correlation coefficient in their example is not zero. 
In fact, they and Burke demonstrate that if it is corrected by the 
Spearman-Brown step-up formula, it is equal to Newbold’s estimate 
of the maximum correlation possible with accidents over the whole 
period, as given above. If the over-all distribution for the combined 
periods is a Poisson (or, more generally, if it is such that its mean and 
variance are equal), then the correlation is zero. But the assumption of 
a Poisson distribution begs the question (and no other suitable dis- 
tribution has been proposed). If such an assumption is necessary, 
there is no unique value in computing the correlation coefficient. The 
same conclusion can be reached by univariate curve fitting methods. 
The conclusion is subject to all the objections which have been raised 
to those methods. 
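The chance expectation described above can be checked numerically. The sketch below assumes, as in the text, that each individual's accidents fall independently into one of two equally likely periods; the function names are illustrative and are not taken from Webb and Jones or from Burke. It computes the exact between-period correlation implied by a set of individual totals and applies the Spearman-Brown step-up.

```python
from math import comb

def split_half_correlation(totals):
    """Exact correlation between period-1 and period-2 counts when each
    individual's total is split binomially (p = 1/2) between two periods.
    Assumes the totals are not all zero (otherwise the variance is zero)."""
    n = len(totals)
    ex = ey = exx = eyy = exy = 0.0
    for t in totals:
        for x in range(t + 1):
            # Probability of picking this individual and seeing x accidents
            # in the first period (and t - x in the second).
            p = comb(t, x) / 2 ** t / n
            y = t - x
            ex += p * x
            ey += p * y
            exx += p * x * x
            eyy += p * y * y
            exy += p * x * y
    cov = exy - ex * ey
    var_x = exx - ex ** 2
    var_y = eyy - ey ** 2
    return cov / (var_x * var_y) ** 0.5

def spearman_brown(r):
    """Step-up of a half-period correlation to the whole period."""
    return 2 * r / (1 + r)

# The four-person example from the text: two accidents each.
print(split_half_correlation([2, 2, 2, 2]))

# Hypothetical totals with mean 3 and (population) variance 10.
r = split_half_correlation([0, 1, 2, 3, 9])
print(r, spearman_brown(r))
```

Under these assumptions the correlation works out to (V - M)/(V + M), where M and V are the mean and variance of the individual totals, so it is zero exactly when the variance equals the mean, and the stepped-up value reduces to 1 - M/V. In the four-person example every total is the same (V = 0), and the correlation is -1.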


TIME INTERVALS BETWEEN SUCCESSIVE ACCIDENTS 


Rather than study distributions of numbers of accidents occurring 
over a period of time, it is possible to study distributions of the times 
between successive accidents. An advantage of this procedure, as 
pointed out by Maguire, Pearson, and Wynn [21], is that an exact 
distribution function (a negative exponential) can be posited for the 
chance distribution of such intervals. Maguire, Pearson, and Wynn 
illustrate several applications of the procedure, primarily aimed at 
determining simply whether a given distribution is departing from 
chance expectancy or from some other observed distribution. 
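As a rough illustration of how a chance (negative exponential) expectation can be set against observed intervals, the sketch below bins a set of intervals and computes the expected count in each bin. It is not the procedure of Maguire, Pearson, and Wynn; the rate is simply estimated as the reciprocal of the mean observed interval, and the interval values and bin edges are hypothetical.

```python
from math import exp, inf

def expected_interval_counts(intervals, edges):
    """Expected number of intervals in each bin [edges[i], edges[i+1])
    under a negative exponential with rate = 1 / (mean observed interval)."""
    n = len(intervals)
    rate = n / sum(intervals)
    survival = [exp(-rate * e) for e in edges]   # P(interval > edge)
    return [n * (survival[i] - survival[i + 1]) for i in range(len(edges) - 1)]

def observed_interval_counts(intervals, edges):
    """Observed number of intervals falling in each bin."""
    return [sum(1 for t in intervals if edges[i] <= t < edges[i + 1])
            for i in range(len(edges) - 1)]

# Hypothetical intervals (in days) between successive accidents.
intervals = [3, 8, 12, 15, 21, 30, 44, 60, 75, 120]
edges = [0, 10, 30, 60, inf]
print(observed_interval_counts(intervals, edges))
print([round(c, 2) for c in expected_interval_counts(intervals, edges)])
```

A formal comparison of the two lists (for example by chi-square) would additionally require adequate expected counts in every bin.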

Horn [16] presented data to show that adjusted aircraft accident 
rates increased sharply soon after an individual incurred an accident. 
He interpreted this finding to indicate a “shock” effect of an accident, 
which might be countered through rest or training after each accident. 
Mintz [23, 24] has criticized Horn for failing to compare his distribution 
with the “chance” exponential. Nevertheless, some of Horn’s data are 
impressive. For example, he shows that accidents attributed to pilot 
error are more frequent relative to non-pilot-error accidents in the 
early months after a first accident than they are later, regardless of 
the nature of the first accident. It is possible that Horn’s results are 
merely a reflection of the fact that Air Force pilots and units are re- 
assigned frequently; thus, both first and succeeding accidents tend to 
occur within a period (which may often be short) when a pilot has a 
relatively hazardous assignment. 

It would seem that Horn’s findings should be checked, as a test not 
only of his conclusions but also of the usefulness of the exponential 
model. 

Mintz [23, 24] has investigated two ways of analyzing accidents of 
178 taxi drivers over a period of one year. His aim was to determine 
whether the data, which were clearly not fitted by a Poisson, could 
best be explained by (a) the hypothesis of unequal but constant sus- 
ceptibilities among the various drivers, or (b) the hypothesis of 
“contagion;” i.e., that the occurrence of an accident increased the 
likelihood of further accidents on the part of each driver. Mintz did 
not consider the hypothesis that accident occurrence might decrease 
the likelihood of further accidents, but apparently could have done 
so within the framework he used. 

One way Mintz proceeded [24] was to compute, on the basis of 
previous work on the random distribution of events within an interval 
[e.g., 21], the expected frequency of occurrence of various groups of 
time intervals. He then compared these frequencies with those observed 
for the taxi drivers. Agreement of the frequency distributions was 
fairly good, indicating in Mintz’s view a preference for the hypothesis 
of constant unequal susceptibilities. 

Mintz’s other procedure [23] was to list average intervals between 
accidents and inspect these figures for correspondence with the two 
hypotheses. His conclusions were based primarily on a comparison 
of mean intervals between successive accidents (the mean intervals 
being computed separately between first and second, second and third, 
third and fourth, etc., accidents) and mean intervals of the same drivers 
before the first accident (or, more precisely, since these were experienced 
drivers, between the beginning of the year and the occurrence of the 
first accident in that year). It is of course to be expected that the 
means in both cases would decrease as the number of accidents increases 
(e.g., the intervals before and between accidents must obviously on 
the average be longer for a driver with two accidents than for a driver 
with three accidents in the year). However, if accidents tend to in- 
crease proneness, argues Mintz, the decrease should be less for the 
intervals before the first accident than for those between successive 
accidents. Since the rates of decrease were about the same, Mintz 
concluded again that the alternate hypothesis of constant unequal 
susceptibilities was supported. 

Bates and Neyman [4] have presented a tentative extension of their 
model to the description of time intervals between accidents. Since 
their model includes explicit parameters representing susceptibility, 
“contagion,” and “time effect” (i.e., the decrease of susceptibility 
with accident occurrence), they are able to deduce the form of a 
particular probability density function when contagion or the time 
effect or both are absent. Surprisingly, when both these effects are 
absent (i.e., when susceptibilities are constant), their model implies 
that time intervals “for arbitrary individuals of the population will be 
uniformly distributed between zero and unity” [4, p. 274]. This seems 
to imply agreement with Horn’s procedures, and hence disagreement 
with those of Mintz. However, Bates and Neyman did not apply their 
procedures to any data or suggest just how it might be done. Hence, 
it is not certain whether such an implication was intended. 

A later article by Bates [2] extends and makes this model somewhat 
more explicit, for the restricted purpose of distinguishing between 
“chance” and “contagious” hypotheses. The “contagious” hypothesis 
now includes only linear contagion, but allows for both positive and 
negative types of contagion (increase and decrease in susceptibility 
after each accident). The basic statistic was found to be the grand 
mean of time intervals, counting from the beginning of the period of 
observation to the occurrence of the last accident of each individual. 
Bates showed that her proposed test of these means has considerable 
power, but did not apply it to any data. 

In general, it would seem that more applications of these types of 
procedures should be made. Until this is done, evaluation of the idea 
is perhaps premature. However, it seems appropriate to point out 
two general types of difficulties which should be attended to in further 
studies. 

First, there may be further alternate explanations of any deviations 
from expectation which may be found in distributions of time intervals 
between accidents. One of these explanations, as Mintz points out, 
is that the effects noted are simply the result of seasonal variations. 
The procedures, as they have been described, assume constancy of 
risk throughout the time interval studied.³ This assumption is a 
questionable one for most empirical accident distributions. Some check 
or correction for equation of risk should be incorporated in future 
studies. 

The second difficulty is a more fundamental problem in the logic 
of the method. Given an observed number of accidents for each indi- 
vidual within a fixed over-all time period, does it logically make any 
difference when the accidents happened? If the intervals between 
accidents are generally short, then there must be a relatively long 
average time between the last accident and the end of the period. By 
the logic of the time interval method, this must mean that proneness 
increased up to the last accident and then decreased. If so, would 
there then have been any net increase in proneness as a result of previous 
accidents within the period? The answer, on the face of the matter, 
would seem to be no. Proponents of this method, it would seem, have 
an obligation to clarify this matter of logic. Until they do so, a certain 
amount of skepticism would appear to be justified. 


ISSUES IN FURTHER RESEARCH 


The methods which have been used to detect individual differences 
in accident susceptibility have a number of logical and practical de- 
ficiencies. Perhaps the general hypothesis of accident proneness is 
not such a fruitful one as to warrant further development of methods 
for its detection. If, however, as seems to be the case, research con- 
tinues on this topic, there would appear to be a pressing need for further 
development of methodology. Such further development could take 
place in at least three different ways. 

³Most of the other methods (with the primary exception of the correlational method, when the 
odd and even periods are chosen in such a way as to cancel out seasonal effects) make the same 
assumption. However, such an assumption becomes much more crucial in the case of the time interval 
method, since everything depends on the times when accidents occurred. 

1. Improvement of present methods. Improvement is needed both in 
the logical foundations of the methods and in their statistical properties, 
particularly the powers of the tests involved. The work of Bates and 
Neyman [2, 3, 4] has provided some encouragement that substantial 
improvements in both respects can be made. 

2. Development of new methods. Many hypotheses about accident 
susceptibility have never been pursued or even stated. If adequate 
models can be developed for several new hypotheses (and if tests can 
be developed to differentiate these models from other models), a good 
deal of vigor would be added to this area of research. It may also be 
that improved statements and models for old hypotheses can be de- 
veloped. 

3. Improvement in available accident data. Although this paper has 
been concerned only with methods for treating data, it would be remiss 
if attention were not called also to deficiencies in the data. The work 
of Warren and his associates [30] and the suggestions of Thorndike 
[28] are perhaps most helpful in this regard. 

New ideas and creative research are needed in this area. At the 
same time, careful attention to the logical and statistical properties of 
the methods used and to the nature of the data to which the methods 
are applied will be required if further research on individual differences 
in accident susceptibility is to make progress. 


SUMMARY 


Most accident distributions look a lot like the Poisson curve. They 
also look like the negative binomial. It is hard to differentiate between 
these two theoretical distributions; and each of them may, in any 
event, be generated by at least two conflicting hypotheses. Univariate 
study of accident distributions leads almost inevitably to inconclusive 
results. 

If bivariate methods (involving two or more time periods) are used, 
the situation is only a little better. Bivariate Poissons and negative 
binomials share the major disadvantages of their univariate analogues. 
Correlational methods imply the Poisson model, while disavowing it 
explicitly. Conclusions from bivariate study of accidents are con- 
flicting and unclear. 

Investigations of time periods between successive accidents have 
produced some promising, but highly tentative, results. The logical 
premises of this type of investigation are still somewhat ambiguous. 

If it is assumed that individual differences in accident susceptibility 
constitute a fruitful area of research, improvements are needed in the 
available methods, in the development of new methods, and in the 
availability of adequate data to which the methods may be applied. 
PRECISION OF ESTIMATES OF VARIANCE COMPONENTS¹ 


THERESE KELLEHER, H. F. ROBINSON AND R. E. COMSTOCK 


North Carolina State College 
Raleigh, North Carolina 


During the course of investigations of quantitative inheritance in 
corn at the North Carolina Experiment Station, a large body of data 
has been collected, suitable for a study of sampling variability of 
estimates of components of variance. Such components have been 
found useful in separating the heritable portion of the variation from 
that associated with factors of environment, and in planning experi- 
ments. 

The particular issues to be examined in these data are the assump- 
tions of normality and homogeneity of variances, which are made when 
one evaluates the precision of estimates of components. 

A preliminary study was conducted by Comstock and Robinson 
[1951]. Although the methods used in their study did not make possible 
any conclusive statements regarding the particular assumptions of 
homogeneity of variances and normality, it was concluded that generally 
satisfactory agreements existed between actual sample estimates of 
variance components and the estimates computed on the basis of these 
assumptions. Considerably more material is now available for a more 
extensive study. 


DESCRIPTION OF MATERIAL 


Details of the experimental procedures used in the investigations 
from which these data were taken have been given in reports by 
Robinson, et al. [1949] and Robinson, et al. [1955]. 

The plant material for a unit of an experiment consists of a set of 
16 full-sib families. The 16 families are made up in 4 groups with 4 
families per group. In any one group the male parent is the same for 
all families, but no two groups have the same male parent. All 16 
families have different female parents. A unit is a two-replicate 
randomized block comparison of the 16 families of such a set. The units of a 
population grown in a single year constitute an experiment. 

¹Contribution from Institute of Statistics of the University of North Carolina Agricultural 
Experiment Station. Journal Paper No. 813. 


The analysis of variance of a unit of an experiment is shown in 
Table 1. 


TABLE 1 
ANALYSIS OF VARIANCE OF A UNIT OF AN EXPERIMENT 

Source                      d.f.   E(MS) 
Total                        31 
Replications                  1 
Males                         3    $\sigma^2_{fr} + 4\sigma^2_{mr} + 2\sigma^2_f + 8\sigma^2_m$ 
Females in males             12    $\sigma^2_{fr} + 2\sigma^2_f$ 
Males X Replications          3    $\sigma^2_{fr} + 4\sigma^2_{mr}$ 
Females in Males X Reps.     12    $\sigma^2_{fr}$ 


$\sigma^2_{fr}$ = component of variance associated with the differential response of full-sib families with 
the same male parent from one replication to another plus the component associated with 
differences among individual plants within plots. 

$\sigma^2_{mr}$ = component of variance associated with the differential response of half-sib families from 
one replication to another. 

$\sigma^2_f$ = component of variance associated with means of full-sib families with the same male parent. 

$\sigma^2_m$ = component of variance associated with means of half-sib families. 
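The expectations in Table 1 imply that each component can be estimated as a linear function of the observed mean squares. The sketch below assumes the usual procedure of equating mean squares to their expectations; the function and argument names are illustrative, and the numerical mean squares are hypothetical.

```python
def variance_components(ms_males, ms_females, ms_males_x_reps, ms_females_x_reps):
    """Solve the Table 1 expectations for the four components
    (2 replications, 4 males per unit, 4 females per male)."""
    s2_fr = ms_females_x_reps
    s2_mr = (ms_males_x_reps - ms_females_x_reps) / 4
    s2_f = (ms_females - ms_females_x_reps) / 2
    s2_m = (ms_males - ms_females - ms_males_x_reps + ms_females_x_reps) / 8
    return {"s2_fr": s2_fr, "s2_mr": s2_mr, "s2_f": s2_f, "s2_m": s2_m}

# Hypothetical mean squares built from components
# s2_fr = 1, s2_mr = 0.5, s2_f = 2, s2_m = 0.25.
print(variance_components(ms_males=9.0, ms_females=5.0,
                          ms_males_x_reps=3.0, ms_females_x_reps=1.0))
```

Because the estimates are differences of mean squares, they can be negative in small samples even though the components themselves cannot.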


Combined estimates of these components are obtained from the 
several units of an experiment and the several experiments in each 
population. The consistency of estimates obtained in this manner 
was investigated. 

Table 2 shows the number of units available, the populations in- 
volved, and the years in which data were collected. 

Two types of genetic populations were included: 


(1) open-pollinated varieties: 
(a) Jarvis 
(b) Weekley 
(c) Indian Chief, 
(2) populations originally developed from the F₂ generation of 
hybrids between inbred lines: 
(a) CI21 X NC7 
(b) NC34 X NC45. 


Crosses were made in 1949 in all populations except Indian Chief, 
and 256 full-sib families were tested in 1950 in each population. The 
same families were grown again in 1951 in Jarvis, Weekley, and 
(CI21 X NC7). In each population, a new randomization of the 
families was made within units the second year. A second sample 
of 256 families was obtained in Jarvis and in Weekley, and these families 
were grown in 1952 and 1953. A single sample of 512 families was 
made up in Indian Chief in 1951 and those families were grown in 1952 
and 1953. The (NC34 X NC45) families grown in 1953 were made 
up in 1952 in the population reconstituted from the upper yielding 5 
per cent of the families grown in 1950. 


TABLE 2 
MAKEUP OF EXPERIMENTAL MATERIAL 

Population      Sample          Units     Reps       Males      Females    Families 
                No.     Year    per yr.   per unit   per unit   per male   per year 
NC34 X NC45     1       1950    16        2          4          4          256 
NC34 X NC45     2       1953    13        2          4          4          208 
CI21 X NC7      1       1950    16        2          4          4          256 
CI21 X NC7      1       1951    16        2          4          4          256 
Jarvis          1       1950    16        2          4          4          256 
Jarvis          1       1951    16        2          4          4          256 
Jarvis          2       1952    16        2          4          4          256 
Jarvis          2       1953    16        2          4          4          256 
Weekley         1       1950    16        2          4          4          256 
Weekley         1       1951    16        2          4          4          256 
Weekley         2       1952    16        2          4          4          256 
Weekley         2       1953    16        2          4          4          256 
Indian Chief    1       1952    32        2          4          4          512 
Indian Chief    1       1953    32        2          4          4          512 


TREATMENT OF DATA AND DISCUSSION OF RESULTS 
Degree and Effects of Non-normality 


Among the measures of departure from normality considered by 
Fisher [1930] was a statistic, $g_2$, where 

$$g_2 = k_4 / k_2^2, \qquad (1)$$

i.e. it is the standardized 4th k-statistic, 

$$E(g_2) = \gamma_2 = (K_4/K_2^2) + O(n^{-1}), \qquad (2)$$

where $K_4$ = 4th cumulant and $K_2$ = 2nd cumulant. 
The parameter $\gamma_2$ is particularly relevant to investigating the 
assumption of normality as it affects the variance of mean squares. 
If the effects in each classification contributing to a mean square are 
independent and have the same distribution, the variance of the mean 
square is 

$$V(M) = [2K_2^2/(N - 1)] + K_4/N \qquad (3)$$
$$\phantom{V(M)} = [2K_2^2/(N - 1)] + \gamma_2 K_2^2/N. \qquad (4)$$

If the effects are normally distributed, 

$$V(M) = 2K_2^2/(N - 1) \qquad (5)$$

because $\gamma_2$ is zero in a normal distribution. 
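To see how much a given $\gamma_2$ changes the variance of a mean square relative to the normal-theory value, the relation above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the value of N used in the example is an arbitrary choice rather than the number of observations behind any mean square in this study.

```python
def variance_of_mean_square(k2, gamma2, n):
    """V(M) = 2*K2^2/(N - 1) + gamma2*K2^2/N for effects with second
    cumulant k2 and standardized fourth cumulant gamma2."""
    return 2 * k2 ** 2 / (n - 1) + gamma2 * k2 ** 2 / n

def ratio_to_normal(gamma2, n):
    """V(M) divided by its value under normality (gamma2 = 0)."""
    return variance_of_mean_square(1.0, gamma2, n) / variance_of_mean_square(1.0, 0.0, n)

# Illustration with N = 32 (an assumed value) and two values of gamma2.
for gamma2 in (-0.0614, -0.3146):
    print(gamma2, round(ratio_to_normal(gamma2, 32), 4))
```

The ratio equals 1 + gamma2 (N - 1)/(2N), so for large N a negative $\gamma_2$ of magnitude g lowers the variance by roughly the fraction g/2.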

Since estimates of components of variance are linear functions of 
mean squares, investigation of the effect of $\gamma_2$ on variances of mean 
squares will provide information regarding variances of estimates of 
components. 

Estimates of $\gamma_2$ were obtained in each population for plot yields of 
corn in pounds per plant. Individual $g_2$ were computed for each unit 
as shown by Fisher [1950]. The means of the unit estimates were 
computed to give an estimate of $\gamma_2$ for each population: 

$$\bar{g}_2 = \frac{1}{r} \sum_{i=1}^{r} g_{2i}, \qquad (6)$$

r = no. of units in a population. 


In this procedure, the values of $g_2$ obtained are free of main effects of 
years and units. They are, therefore, appropriate values to consider 
in the study of the components of variance since these too are free of 
the main effects of years and units. 
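For readers who wish to compute the statistic, the sketch below evaluates $g_2 = k_4/k_2^2$ from Fisher's k-statistics for a single simple random sample and then averages it over several samples. It is a simplified illustration: it does not remove block or family effects as the intra-block values in this study do, and the data are hypothetical.

```python
def g2(sample):
    """Fisher's g2 = k4 / k2^2 for a simple random sample (needs n >= 4)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n   # central moments (divisor n)
    m4 = sum((x - mean) ** 4 for x in sample) / n
    k2 = n * m2 / (n - 1)
    k4 = (n ** 2 * ((n + 1) * m4 - 3 * (n - 1) * m2 ** 2)
          / ((n - 1) * (n - 2) * (n - 3)))
    return k4 / k2 ** 2

def mean_g2(unit_samples):
    """Unweighted mean of the unit values of g2."""
    values = [g2(s) for s in unit_samples]
    return sum(values) / len(values)

# Hypothetical data: two small "units".
print(g2([1, 2, 3, 4, 5]))
print(mean_g2([[1, 2, 3, 4, 5], [2.0, 2.5, 2.7, 3.1, 4.9, 3.3]]))
```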

The information obtained in this study regarding the effect of non- 
normality on the variance of mean squares is directly applicable only 
to the total mean square of Table 1, pooled over units and years for 
a population. It is possible that the variance of one or more of the mean 
squares into which the total is partitioned may be affected considerably 
by non-normality of the distribution while the variance of the total 
mean square is not so affected. However, the effects of non-normality 
on the individual mean squares of Table 1 were not investigated. In 
the case of three of the partitions, i.e. replication, replications X males, 
and replications X females in males, the two replications were insuf- 
ficient to estimate fourth degree parameters independently for each 
partition. The four male means for each unit and the four female means 
for each male could be used to estimate a $\gamma_2$ for the male and female in 
male, but such estimates would be very inefficient because of the small 
numbers of the two categories. 
The estimates of $\gamma_2$ for the various populations are given in Table 
3. Standard deviations for $\bar{g}_2$ are also shown for each population. The 
standard deviation of an individual $g_2$, when sampling from a normal 
universe, was shown by Fisher [1930] to be: 

$$\text{s.d.}(g_2) = \sqrt{\frac{24N(N - 1)^2}{(N - 3)(N - 2)(N + 3)(N + 5)}}, \qquad (7)$$

N = number of observations in an individual unit. 
The standard deviation of the mean of the $g_2$'s for each population 
was computed as: 

$$\text{s.d.}(\bar{g}_2) = \text{s.d.}(g_2)/\sqrt{r} \qquad (8)$$


TABLE 3 
ESTIMATES OF $\gamma_2$ (INTRA-BLOCK) IN GENETIC POPULATIONS OF CORN 
(YIELD MEASURE) 

Population       No. units in $\bar{g}_2$    $\bar{g}_2$    Theoretical s.d.($\bar{g}_2$) 
CI21 X NC7        32                         +0.0347        0.14311 
NC34 X NC45       29                         -0.3146        0.15029 
Jarvis            64                         +0.0588        0.10117 
Weekley           64                         -0.0146        0.10117 
Indian Chief      64                         -0.1268        0.10117 
Total            253                         -0.0614        0.05088 


Positive values were obtained for $\bar{g}_2$ in two of the five populations, 
(CI21 X NC7) and Jarvis. The remaining three populations yielded 
negative values for this statistic. In one population (NC34 X NC45), 
the value of $\bar{g}_2$ fell outside ±2 s.d.($\bar{g}_2$) while in the other four populations 
the values obtained for $\bar{g}_2$ fell within 2 standard deviations of the 
expected value of zero. The apparent departure of the (NC34 X NC45) 
from normality is not considered to be out of line with expectation 
since the comparison in this population is the only one of the five 
independent comparisons for which the deviation fell outside 2 s.d.($\bar{g}_2$). 

In general, the criterion used to compare observed and theoretical 
values of $g_2$ is affected by skewness, and the sampling distribution of 
$g_2$ from a normal universe is known to be skewed for small samples of 
observations [Pearson, 1930]. In view of the large number of 
observations and the general agreement among the estimates, no serious 
error is anticipated from this source. 
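The theoretical standard deviations in Table 3 and the two-standard-deviation comparison can be reproduced approximately as follows. The sketch assumes N = 32 observations per unit (the 31 total degrees of freedom of Table 1 plus one); the function names are illustrative, and small differences from the printed values in the last decimal place are to be expected.

```python
from math import sqrt

def sd_g2(n):
    """Fisher's standard deviation of g2 in samples of n from a normal universe."""
    return sqrt(24 * n * (n - 1) ** 2 / ((n - 3) * (n - 2) * (n + 3) * (n + 5)))

def sd_mean_g2(n, r):
    """Standard deviation of the mean of r independent unit values of g2."""
    return sd_g2(n) / sqrt(r)

# (number of units, mean g2) as printed in Table 3.
table3 = {
    "CI21 X NC7": (32, 0.0347),
    "NC34 X NC45": (29, -0.3146),
    "Jarvis": (64, 0.0588),
    "Weekley": (64, -0.0146),
    "Indian Chief": (64, -0.1268),
}

N_PER_UNIT = 32  # assumption: 16 families x 2 replications per unit
for name, (r, gbar) in table3.items():
    sd = sd_mean_g2(N_PER_UNIT, r)
    print(f"{name:13s} s.d. = {sd:.5f}  ratio = {gbar / sd:+.2f}")
```

Only the (NC34 X NC45) ratio exceeds 2 in absolute value, in line with the statement above.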
Heterogeneity of Variance 


The general procedure used in studying the heterogeneity of the 
components of variance was to carry out analyses of variance of the 
estimates. Because of the randomization pattern within units, a 
separate analysis of each male group was possible. This analysis is 
shown in Table 4. 


TABLE 4 
ANALYSIS OF VARIANCE OF A GROUP 

Source of variation        d.f.   E(M.S.) 
Replications                1 
Females                     3     $\sigma^2_{fr} + 2\sigma^2_f$ 
Females X replications      3     $\sigma^2_{fr}$ 


Estimates of $\sigma^2_f$ and $\sigma^2_{fr}$ were obtained from the analysis of each male 
group. The male X replication component, $\sigma^2_{mr}$, cannot be estimated 
directly by this technique. This component is considered to be of the 
same order as $\sigma^2_{fr}$ and information concerning $\sigma^2_{fr}$ should, in general, 
apply to $\sigma^2_{mr}$ also. Clearly, estimates of $\sigma^2_m$ are not obtainable since 
analyses are run on each male group separately. 
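The analysis of Table 4 for one male group can be sketched as below. This is a minimal illustration, assuming the estimates are obtained by equating the two mean squares to their expectations; the function name and the plot yields are hypothetical.

```python
def group_components(plots):
    """plots[i][j] = plot value of female family i in replication j.
    Returns (s2_f, s2_fr) from the females and females x replications
    mean squares of a single male group."""
    f = len(plots)        # females in the group (4 in this study)
    r = len(plots[0])     # replications (2 in this study)
    grand = sum(sum(row) for row in plots) / (f * r)
    fem_means = [sum(row) / r for row in plots]
    rep_means = [sum(plots[i][j] for i in range(f)) / f for j in range(r)]
    ss_f = r * sum((m - grand) ** 2 for m in fem_means)
    ss_fr = sum((plots[i][j] - fem_means[i] - rep_means[j] + grand) ** 2
                for i in range(f) for j in range(r))
    ms_f = ss_f / (f - 1)
    ms_fr = ss_fr / ((f - 1) * (r - 1))
    s2_fr = ms_fr
    s2_f = (ms_f - ms_fr) / r   # E(MS females) = s2_fr + r * s2_f
    return s2_f, s2_fr

# Hypothetical yields (pounds per plant) for 4 families x 2 replications.
plots = [[0.42, 0.47], [0.51, 0.49], [0.38, 0.45], [0.55, 0.58]]
print(group_components(plots))
```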

The results for hybrids and variety populations are shown in Tables 
5 and 6. The sources of variation are as outlined in Table 2 for each 
population. 


TABLE 5 
ANALYSIS OF VARIANCE OF $s^2_f$ AND $s^2_{fr}$ FOR HYBRID POPULATIONS 

                      CI21 X NC7                        NC34 X NC45 
                      d.f.   m.s.                       d.f.   m.s. 
                             $s^2_{fr}$   $s^2_f$              $s^2_{fr}$   $s^2_f$ 
Total                 127                               115 
Years                   1    590*         124             1    47           164 
Units in years         30     47           38            27    30            24 
Residual in units      96     56           47            87    37            22 

*All values X $10^{-6}$. 
TABLE 6 
ANALYSIS OF VARIANCE OF $s^2_f$ AND $s^2_{fr}$ FOR VARIETY POPULATIONS 

                      Jarvis                      Weekley                     Indian Chief 
                      d.f.  m.s.                  d.f.  m.s.                  d.f.  m.s. 
                            $s^2_{fr}$  $s^2_f$         $s^2_{fr}$  $s^2_f$         $s^2_{fr}$  $s^2_f$ 
Total                 255                         255                         255 
Samples                 1    22*         80         1    79          25 
Years in samples        2   201          17         2   224          38         1   2092     [illegible] 
Units in years         60    55          30        60    24          20        62     27          12 
Residual in units     192    44          25       192    37          23       192     22          18 

*All values X $10^{-6}$. 


Heterogeneity was not detected for either component when sampling 
was within years. These results confirm on a broader basis of material 
the conclusions reached in the previous study [Comstock and Robinson, 
1951]. 

The comparisons between years show heterogeneity for $s^2_f$ in only 
one population (NC34 X NC45). In this population, samples of 
different genetic composition were observed in different years and the 
second sample could be different inasmuch as it was reconstituted from 
the top yielding five per cent of the progenies of the first sample. In 
each of the other populations, the samples were randomly chosen and 
no heterogeneity for $s^2_f$ was detected among years. In contrast, 
differences in $s^2_{fr}$ among years were found in four of the five populations. If 
reference is made to Table 2, it can be seen that in each of these 
populations, 1950 is compared with 1951 and/or 1952 with 1953, whereas for 
(NC34 X NC45) (the population in which no difference was detected 
for $s^2_{fr}$), 1950 was compared with 1953. It is recognized that the sample 
of years is inadequate for general conclusions. However, it is evident 
that the behavior of the different components is not the same when 
sampling at this level and that further investigations should consider 
the specific nature of each component. 


Comparison of Residual vs. Theoretical Variances 


Following the procedures outlined in Bartlett and Kendall [1946], 
the residual variance, i.e. mean square for residual in units in Table 
6, for² ln $s^2_{fr}$ was compared to the theoretical variance of the ln $s^2$ in a 
normal universe with constant variance. The comparison for each 
population is given in Table 7. The theoretical variance is that for 
N - 1 = 3 in Table I of Bartlett and Kendall (op. cit.). Comparison 
of this value with the estimated residual variance in ln $s^2_{fr}$ constitutes 
a dual check on the previous two procedures for this estimate since 
both normality and homogeneity are included in this test. The 
comparison is not made for $s^2_f$ since it is a linear function of mean squares. 


TABLE 7 
COMPARISON OF RESIDUAL FOR LN $s^2_{fr}$ WITH THEORETICAL LN $s^2$ 

Populations      d.f. for residual    Residual ln $s^2_{fr}$    Theoretical ln $s^2_{fr}$ 
CI21 X NC7        96                  0.90162                   0.93484 
NC34 X NC45       87                  1.4592                    0.93484 
Jarvis           192                  0.89021                   0.93484 
Weekley          192                  1.0048                    0.93484 
Indian Chief     192                  0.73282                   0.93484 


From Table 7, it is seen that the deviation between Residual and 
Theoretical variances of ln $s^2_{fr}$ is positive in three populations and is 
negative in two populations. Only one deviation, i.e. the deviation in 
(NC34 X NC45), approaches significance. The result of the comparison 
in this population differs from the results shown in Table 5 for variation 
on the original scale. Bartlett and Kendall [1946] have shown that this 
test exaggerates significance when N is small and it is suggested that the 
apparent significance of the estimate in Table 7 is less than the real 
significance. 
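The theoretical value used in Table 7 can be checked against a standard result: for a normal sample, ln $s^2$ based on $\nu$ degrees of freedom has variance equal to the trigamma function evaluated at $\nu/2$. The sketch below implements the trigamma function with the standard library only (recurrence plus an asymptotic series) and forms the ratio of each residual variance to the theoretical value; the helper name is illustrative, and the ratios are descriptive rather than a significance test.

```python
from math import pi

def trigamma(x):
    """Trigamma function psi'(x) for x > 0."""
    total = 0.0
    while x < 6.0:                 # recurrence: psi'(x) = psi'(x + 1) + 1/x^2
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic series: 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    total += inv + inv2 / 2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 / 30)))
    return total

theoretical = trigamma(3 / 2)      # N - 1 = 3 degrees of freedom
print(round(theoretical, 5))       # compare with 0.93484 in Table 7

# Residual variances of ln s2_fr as printed in Table 7.
residuals = {"CI21 X NC7": 0.90162, "NC34 X NC45": 1.4592, "Jarvis": 0.89021,
             "Weekley": 1.0048, "Indian Chief": 0.73282}
for name, value in residuals.items():
    print(f"{name:13s} residual / theoretical = {value / theoretical:.3f}")
```

The computed value, $\pi^2/2 - 4$, agrees with the tabled 0.93484 to about four decimal places.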


SUMMARY 


This study has been concerned with the effects of non-normality of 
parent distributions and lack of homogeneity of variances on the 
precision of estimates of components of variance when the precision 
has been estimated on the basis of these assumptions. 

Three procedures were used to investigate measurements of plot 
yields in five populations of corn. The procedures were: (1) estimation 
of $\gamma_2$, a fourth degree parameter which enters into variances of 
components of variance in non-normal populations, (2) analysis of variance 
of estimates of two variance components, (3) comparison of residual 
mean squares of estimates with the theoretical variance in a normal 
universe with common variance. 

²ln is used to denote a natural logarithm. 

Estimates of variance of components of variance were concluded to 
be unaffected generally by non-normality, to the level investigated. 

If sampling is within years, the variability of estimates of 
components can be calculated on the assumption of common variances in 
the parent distributions. Estimates of one of the two components 
investigated varied more from year to year than would be expected 
on the basis of sampling alone while the estimates of the other component 
did not. The sample of years was too small for full evaluation of these 
results but it was suggested that the composition of the individual 
components should be considered in any future study of sampling 
behavior. 

The results of the comparisons of residual and theoretical variances 
were in general agreement with those obtained from the other two 
procedures. 
SAMPLING TECHNIQUE FOR ESTIMATING THE CATCH OF 
SEA FISH IN INDIA 


P. V. SUKHATME 
F.A.O. Rome, Italy 
V. G. PANSE AND K. V. R. SASTRY 
I.C.A.R., New Delhi, India 


1. Introduction 


Fishing is a principal industry of the villages along the coast of 
India. The equipment used for fishing consists mostly of a boat and 
net. The boats are small in size with the majority between 12 and 
25 feet in length. They are rowed by the fishermen to within small 
distances from the coast, depending upon the direction and velocity 
of the wind, the season, and the availability of fish. The crew consists 
of the fisherman and one or more members of his family or hired men. 
The hours of fishing depend upon the tide and the season of the year, 
but the activity is the greatest during early hours of the day. When 
fish are available in plenty, several trips are made during the day. As 
soon as the boat lands, it is surrounded by the buyers who bid for the 
catch and take it to the market. Whatever fish are not sold fresh are 
taken to the curing yard for salting or are dried in the sun. 

The catch is landed at all hours each day. Night-fishing is not 
uncommon. The difficulties of making reliable estimates of the catch 
are, therefore, evident. It is the object of this paper to describe the 
sampling technique developed in India for estimating the catch brought 
in by boats and nets. 


2. The Sampling Units 


There is no compulsory registration of boats. The first attempt 
at providing a frame for sampling was, therefore, directed to taking 
a census of the fishing boats at the start of the season in the villages 
along the coast. A sample of boats was selected out of those listed 
at any village and observed for the catch brought during the day. 
The catch so brought, multiplied by the inverse of the sampling fraction, 
was thought to provide an estimate of the total catch for the selected 
village. It was found, however, that the boats changed hands 
frequently during the season and did not always return to the same place 
from which they took off. It was, therefore, impossible to keep track 
of the boats included in the sample. Moreover, choice of a boat for 
observation in advance of the trip aroused the suspicions of the fishermen 
and induced them to land boats deliberately away from their normal 
place of landing, and the fishermen whose catch could not be recorded 
directly on landing gave information of doubtful value when 
interrogated later by the enumerator. The most important finding made 
during this attempt was that the fishermen land their catch not at the 
villages where they live but at landing centres which are convenient 
points along the coast suitable for landing and connected with easy 
communication to the market, road, or rail-head. On the one hundred 
mile coastline of Malabar, along which the present investigation was 
conducted, there are about 96 villages and 61 landing centres. These 
difficulties suggested, among other things, the use of a landing centre 
as a sampling unit in place of a village and the need for an altogether 
different approach for estimating the daily catch at any centre. 

The total catch during a specified time interval at any centre is 
the sum of the catches brought to the centre by all the boats landing 
during that interval. A count of the number of boats landing during 
a specified interval can be made by observation from a vantage point 
at the landing centre, but information on the weight of catch is difficult 
to obtain, because several boats may land simultaneously, and this 
number is large during certain hours, and the catch is disposed of 
almost immediately on landing. It is therefore not possible for an
enumerator to record the catch for every boat landing during a specified
time interval.


TABLE 1

RELATIVE VARIANCES (WITHIN DAYS) OF CATCH IN MD. PER HOUR (z), NUMBER
OF BOATS LANDING PER HOUR (x) AND CATCH IN MD. PER LANDING BOAT (y)*

                 7th Week                  22nd Week
           Average     Relative      Average     Relative
                       Variance                  Variance
   z       120.16        3.65         28.86        6.45
   x        13.52        2.30          6.91        2.88
   y        10.45        0.30          4.21        0.52

*The Table is based on the data recorded during two weeks, the 7th week
of the survey commencing 21 October 1950 and the 22nd week commencing
10 February 1951, at all the 61 landing centres on the Malabar Coast.
The maund is a unit of weight equal to 82 lbs. approximately.

A practicable alternative is to consider the total catch during a 
specified interval as the product of two factors: the number of landings 
during the interval and the average catch per landing boat, and estimate 
the two separately for each time interval included in the sample. Since
the relative variance of catch per boat is much smaller than that of
the hourly count within a day, as can be seen from Table 1, it seems
feasible to estimate the former from a relatively small sub-sample of
the boats landing during any specified interval. The regression
analysis given in Table 2 confirms this inference. Moreover, the
correlation between the number of boats landing hourly and the
corresponding average catch per boat is heterogeneous but small
(Table 3). Without, therefore, appreciably increasing the variance, it
would appear feasible to divide the specified time interval into two
suitable parts, in one of which, randomly chosen, observations can be
made on the count and in the other on the catch of a sub-sample of the
boats landing during that part of the interval.


TABLE 2

REGRESSION ANALYSIS OF THE HOURLY CATCH (z), (i) ON CORRESPONDING COUNT
OF BOATS LANDING (x) AND (ii) ON CORRESPONDING AVERAGE CATCH PER LANDING
BOAT (y) AT ONE CENTRE EACH IN 7TH AND 22ND WEEKS

                                  7th week                           22nd week
Source of                       S.S.      m.s.                     S.S.     m.s.
Variation               d.f.   (md²)   (md²)/hr.    F      d.f.   (md²)  (md²)/hr.    F

Due to regression on x    1   2227881                        1     8542
Residual                 24    562827    23451     95**      9     6327     703      12**
Due to regression on y    1    959649                        1      151
Residual                 24   1831059    76294     13**      9    14718    1635       1 n.s.
Total                    25   2790708                       10    14869

**Indicates significance at the 1% level; n.s. indicates not significant at the 5% level.


3. Efficiency of Different Methods of Sampling a Day


To start with, the investigation into the efficiency of different
methods of sampling a day was confined to one large centre (Quilandy).
A count made of the number of landings and catch every hour from 6
a.m. to 6 p.m. on seven consecutive Mondays was used to study the
efficiency of two methods of sampling the day, namely, simple random
sampling and systematic sampling. The latter method offered obvious
practical advantage in regulating the enumerator's work and in
facilitating supervision over it. On the other hand, the efficiency of
the systematic sampling relative to simple random sampling depended
upon the pattern of variation in count and catch between successive
time intervals. Only a statistical analysis of the data collected over
different times of the fishing season and at different centres could
therefore determine the type of sampling best suited for estimating the
catch during the day.


TABLE 3

CORRELATIONS WITHIN DAYS BETWEEN NUMBER OF HOURLY LANDINGS AND
CORRESPONDING AVERAGE CATCH PER BOAT AT DIFFERENT SETS OF 12
CENTRES OBSERVED DURING 7TH AND 22ND WEEKS

                  Values of the correlation coefficient
   Centre              7th week        22nd week
   I                    0.111            0.212
   II                  -0.025            0.158
   III                  0.187            0.385
   IV                  -0.397            0.101
   V                    0.157           -0.062
   VI                  -0.289           -0.220
   VII                 -0.036           -0.447
   VIII                -0.690           -0.294
   IX                  -0.443            0.100
   X                    0.036           -0.110
   XI                   0.154            0.148
   XII                 -0.024            0.138
   Pooled over
   all centres         -0.044            0.091

Table 4 shows the recorded data for landings. It will be seen that 
there is marked variation from hour to hour during the day. Further, 
the variation is largely systematic exhibiting rather high correlation 
between successive values. 

Table 5 illustrates the method of calculating the efficiency of the
two methods of sampling. The efficiency has been calculated for 4
different samples, namely, those of 2, 3, 4, and 6 hours each. The
table gives the analysis of variance together with the values of the
intraclass correlation and of the variance of systematic relative to
random sampling for the data collected on the 3rd Monday. The latter
was calculated from:


$$V_{sy} = \frac{km - 1}{km}\,\frac{S_h^2}{m}\,\bigl[1 + (m - 1)\rho\bigr] \qquad (1)$$

$$V_R = \left(1 - \frac{1}{k}\right)\frac{S_h^2}{m} \qquad (2)$$


where $V_{sy}$ and $V_R$ stand for the variance of the hourly count
estimated from the systematic and random samples of $m$ respectively,


$km = 12$,
$S_h^2$ = the mean square between hours during the day, and
$\rho$ = the intraclass correlation between units (hours) in the sample.


The intraclass correlation is seen to be negative for all the four
systematic samples. The value of the intraclass correlation approaches
the maximum that could be attained when the sample size is 6. The
table shows that systematic sampling is considerably more efficient
than random sampling for estimating the daily count of landings.


TABLE 6

VALUES OF INTRACLASS COEFFICIENT OF CORRELATION AND OF VARIANCE OF
SYSTEMATIC RELATIVE TO RANDOM SAMPLING FOR NUMBERS OF BOATS
LANDING IN SPECIFIED TIME INTERVALS

                    Values of ρ                     100 V_sy/V_R
Size of
sample →      2       3       4       6         2      3      4      6
Week
  1         -.34    -.265   -.26    -.13        73     61     30     64
  2         -.33    -.36    -.18    -.16        74     34     63     37
  3                         -.16    -.199       49     56     72      1
  4         -.29    -.29    -.31    -.190       78     51     10      9
  5         -.51    -.48    +.04    -.193       54      4    154      6
  6         -.79    -.27    -.32    -.197       23     56      6      3
  7                         -.16    -.13        46     59     72     64

The above conclusion is confirmed by the values of the intraclass
correlation and of the relative variance for all the 7 Mondays of the
7 weeks given in Table 6.

Table 7 presents the data on hourly catch for the same days and
the same centre as those in Table 4, and Table 8 sets out the
corresponding values of the intraclass correlation and of the relative
variance. Table 8 confirms the conclusion derived from an analysis of
landings, namely that the systematic sampling is considerably more
efficient than random sampling for estimating the daily catch. The
close similarity between the sets of values for catch and landings
should be noted. It shows that it will suffice to develop the sampling
design for estimation of catch based on the study of hourly landings.

With a view to studying the applicability of this conclusion to
different centres, and from the wider angle of the work-load which an
enumerator can reasonably carry during the day, the hourly data on
landings were recorded every day for two months, January and April,
at landing centres randomly selected along the Malabar coast. The
data were collected for 14 hours from 5 a.m. to 7 p.m. The data for
the first and last hours of the day were, however, not taken into
account in the analysis, as these two hours did not account for much
activity and their omission made it more convenient to study the
relative efficiency of the systematic and simple random methods of
selecting a sample of 2, 3, 4, and 6 hours. The two methods were
studied with clusters of 1, 2, and 3 consecutive hours respectively as
units of sampling for observation. The variance was calculated using
the formulas (1) and (2) with appropriate modifications. Thus, for
clusters of 2 hours,


$km = 6$,
$m = 1, 2,$ and $3$,
$S_c^2$ = the mean square between clusters (per hour basis) within
days, and
$\rho$ = the intraclass correlation between clusters in the sample.


These values of the variance, pooled over all the centres and expressed
as percentages of the variance of the simple random sample of 6 hours
with an hour as the unit of sampling, are given in Table 9 for the two
months, January and April.

The first thing to be noticed from the table is that cluster sampling
is less efficient than sampling with an hour as the sampling unit,
the loss of efficiency increasing with the size of the cluster. This is to
be expected in view of the high positive intraclass correlation between
fishing activities in successive hours. Secondly, the variance of
systematic selection relative to simple random sampling with an hour as
the sampling unit is less and decreases with the size of the sample,
thereby confirming the superiority of systematic sampling of hours over
simple random selection. A systematic selection in clusters of 2 hours
is less efficient than systematic selection of hours, although still
superior to simple random selection. Systematic selection of clusters
of 3 hours is much less efficient than the corresponding selection of
clusters of two and one hours and also inferior to simple random
selection of hours. Considered from the viewpoint of precision alone,
systematic selection of hours would, thus, appear to be the best method
of sampling a day for estimating the number of landing boats. There
are, however, other considerations to be kept in view. Thus, every
visit to the coast involves time and labor for the journey. Assuming
that it takes half an hour to reach the coast and an equal time to return
home, a sample of six different visits would mean work for 12 hours a
day, which is obviously longer than what practical considerations
would allow. A reasonable work-load for an enumerator would be
between 6 and 8 hours. Schemes of work satisfying this criterion are
(i) four visits of 1 hour each, involving a work-load of 8 hours; (ii)
three visits of 1 hour each, involving a work-load of 6 hours; (iii) two
visits of 2 hours each, involving a work-load of 6 hours; and (iv) two
visits of 3 hours each, involving a work-load of 8 hours. Of these, the
first scheme would not be practicable, since as many as four trips during
the day would not leave the enumerator sufficient time for rest at home.
For this reason, it would appear necessary to restrict the work
to preferably two visits to the coast, and in no case more than three.
The schemes of work satisfying this criterion and at the same time
not involving more than 6-8 hours of work-load are (ii), (iii), and (iv).
From Table 9 the variance of systematic sampling according to these
three schemes relative to a random sample of 6 hours is seen to be
213, 178, and 136 for April and 219, 172, and 119 for January. These
figures indicate that scheme (iv), namely, that of systematic selection
involving two visits of 3 hours each, is to be preferred to other schemes.
Where, however, the journey time involved is smaller, the choice of 3
visits of 2 hours each may be recommended.
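
The work-load arithmetic behind schemes (i) to (iv) can be laid out explicitly. The following is only an illustrative sketch: the function names are hypothetical, and the half hour of travel each way is the assumption stated in the text.

```python
def work_load_hours(visits, hours_per_visit, travel_each_way=0.5):
    """Observation time plus the journey to and from the coast for every visit."""
    return visits * (hours_per_visit + 2 * travel_each_way)


# (visits, hours per visit) for the four schemes listed in the text.
SCHEMES = {"(i)": (4, 1), "(ii)": (3, 1), "(iii)": (2, 2), "(iv)": (2, 3)}


def admissible(schemes, max_visits=3, low=6, high=8):
    """Schemes with at most max_visits trips and a work-load of low..high hours."""
    return [name for name, (visits, hours) in schemes.items()
            if visits <= max_visits and low <= work_load_hours(visits, hours) <= high]


print(work_load_hours(6, 1))   # six separate one-hour visits
print(admissible(SCHEMES))
```

Six one-hour visits come to 12 hours, as stated above, and only schemes (ii), (iii), and (iv) pass both the work-load and the number-of-visits conditions.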


4. The Choice of Sampling Design 


The sub-sampling scheme for estimating the daily catch at any 
given centre described in the preceding section has been worked out 
on the assumption that a centre-day will be the first-stage sampling 
unit and the selected centre-day will be enumerated completely to the 
extent practical considerations allow the field enumerator to do it. 
The other alternative would be to observe a landing centre in succession 
for, say, D days, implying the choice of centre-day-group as unit of
sampling. If N denotes the number of landing centres along the coast
there will be 30N/D such first-stage units in the population. Clearly, 
the larger the D, the smaller will be the number of first-stage sampling 
units in the population and the larger therefore will be the variance 
of the monthly estimate of catch for a given number of centre-days 
in the sample. As against this, the smaller the value of D, the larger 
will be the expenditure on travel for an equivalent sample of centre- 
days during the month. The determination of the number of successive 
days for which the observations should be made at a stretch at a landing 
centre amounts to the determination of the optimum size of the first- 
stage sampling unit and has to be made by weighing the loss in precision 
from increasing the size of the first-stage unit, against increase in 
travel expenditure from a larger number of landing centres in the 
sample during the month. 

As assumed already, let a month of 30 days (a day comprising 12
hours, from 6 a.m. to 6 p.m.) be the period for which the estimate of
total catch is required and let $n$ be the number of landing centres in
the sample to be selected out of the $N$ centres along the coast after
every $D$ days. This is equivalent to stratifying a month in time strips
of $D$ days each and selecting an unrestricted sample of $n$
centre-day-groups from the available centre-day-groups in each
time-stratum. We shall also assume that the coast is divided into a
number of suitable geographical strata and that, further, $n$ is
distributed proportionally among them. Let $y_{ij}$ represent the catch
on the $j$th day at the $i$th centre.

The estimate of the daily catch in any time-strip of $D$ days for
the entire coast can then be written as


$$\frac{N}{n}\sum_{i=1}^{n}\bar{y}_{i\cdot}, \qquad \bar{y}_{i\cdot} = \sum_{j=1}^{D} y_{ij}/D,$$


and the estimate of the daily catch over all the $k$ time-strips in the
month as


$$\frac{N}{nk}\sum_{t=1}^{k}\sum_{i=1}^{n}\bar{y}_{i\cdot}(t) \qquad (3)$$


where $k = 30/D$ and $\bar{y}_{i\cdot}(t)$ is like $\bar{y}_{i\cdot}$ used above but for the $t$th time-strip.
Multiplied by the number of fishing days, this would provide an
estimate of the total monthly catch along the coast.

We shall assume that $y_{ij}$ is determined without error, which
assumption, as we saw, is reasonable even if the field enumerator were
to work for only 8 out of 12 hours. The variance of the estimate, taken
per centre (that is, after division by $N^2$), would then be:


$$V = \frac{D\,S_D^2\,(N - n)}{30\,nN} \qquad (4)$$


where


$S_D^2$ = the mean square between centre-day-group means within
time-space stratum.


Now the cost of the survey for one month can be considered to
be made up of two components, one representing the remuneration to
field staff for days of field work, and the other representing the cost
of travel between centres. The second component will theoretically
vary with the distance to be traveled between the centres within strata
over all strata. In practice, this latter component would roughly vary
with the number of centres in the sample, $n$, in each time-stratum,
and hence be proportional to $30n/D$. The total cost on the survey
can, thus, be represented by:


$$C = c_1(30n) + c_2(30n/D) \qquad (5)$$


where $c_1$ and $c_2$ are constants representing the salary per man-day
and the average travel cost between two centres.

$S_D^2$ will presumably decrease as $D$ increases, but at a rate slower
than $1/D$, so that the variance of the estimated catch would presumably
increase with $D$. The cost, on the other hand, is seen to decrease as
$D$ increases. The optimum value of $D$ is that value for which the cost
of survey, $C$, given by (5) is the least for estimating the monthly catch
with the desired variance $V$. From (4) and (5) we get, neglecting the
finite population correction factor,


$$CV = S_D^2\,(c_1 D + c_2) \qquad (6)$$


Clearly, (6) represents the inverse of information per unit of cost
on survey and therefore its minimum value will provide the optimum
value of $D$, when $V$ is fixed and $C$ is minimised.
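
That the product of cost and variance does not depend on the sample size n can be checked numerically. The sketch below is illustrative only: the function names and the numerical values are hypothetical, and the variance is taken in the form D·S²/(30n), that is, equation (4) with the finite population correction set to unity.

```python
def survey_cost(n, D, c1, c2, days=30):
    """Equation (5): staff cost for every centre-day plus travel between centres."""
    return c1 * days * n + c2 * days * n / D


def estimate_variance(n, D, s2_D, days=30):
    """Equation (4) with the finite population correction taken as unity."""
    return D * s2_D / (days * n)


def cost_times_variance(n, D, s2_D, c1, c2):
    return survey_cost(n, D, c1, c2) * estimate_variance(n, D, s2_D)


# Hypothetical figures: S_D^2 = 100, c1 = 2, c2 = 3, D = 2.
for n in (5, 20, 60):
    print(n, cost_times_variance(n, D=2, s2_D=100.0, c1=2.0, c2=3.0))
```

Whatever n is chosen, the product stays at S²(c₁D + c₂), which is why D can be optimised separately from n.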

The value of (6) depends upon the value of $S_D^2$, which in its turn
depends upon the value of $D$. The relationship between $S_D^2$ and $D$
can only be ascertained from the study of actual data and is given
in Table 10 separately for the counts and the daily catch. The
relationship is based on the study of the data collected for three
weeks, one during the busy period, one during a period of average
activity, and one when the fishing activity is low. The table shows
that as $D$ increases, $S_D^2$ for count decreases, but at a rate slower
than $1/D$. So far as the daily catch is concerned, the rate of
reduction in the value of $S_D^2$ is even slower.

Fitted values of $S_D^2$ using Fairfield Smith's function $S^2/D^g$ are
given beside observed values in Table 10. The relationship shows that
the value of $g$, which should be 1 for independent random sampling
of days, is very small and of the order of .1 or .2 for the daily catch.
In other words, the data show that there is a high correlation between
the total catch on successive dates. This is confirmed by reference
to Table 11 which gives values of intraclass correlations for the count
and the daily catch within clusters of $D$ consecutive days. It would
appear that unless cost considerations point otherwise, it will not be
worth while increasing the duration of observation at a selected centre
for more than a day.


TABLE 10

RELATION BETWEEN SIZE OF TIME CLUSTER (D CONSECUTIVE DAYS) AND
CORRESPONDING MEAN SQUARES BETWEEN CENTRES FOR NUMBER OF
LANDINGS PER DAY AND DAILY CATCH, BASED ON DATA COLLECTED
AT 12 CENTRES DURING PEAK WEEK (7TH), AVERAGE WEEK
(11TH) AND POOR WEEK (22ND)

Number of con-          7th week              11th week             22nd week
secutive days in    Observed   Fitted     Observed   Fitted     Observed   Fitted
time cluster (D)    (S_D²)   (S²/D^g)     (S_D²)   (S²/D^g)     (S_D²)   (S²/D^g)

                    m.s. for number of landings per day
       1             35313     41100        6120      5712        5679      5900
       2             36530     29165        4908      4788        4791      4795
       3             25852     23862        3564      4308        4674      4248
       4             19588     20694        3816      4008        4004      3895
       5             17678     18532        3804      3780        3539      3646
       6             16087     16933        4140      3600        3275      3452
       S²                      41100                  5712                  5900
       g                        0.49                  0.26                  0.30

                    m.s. for daily catch
       1             10234     10165        3706      3929        1989      2200
       2              8142     10317        4066      3340        1980      1984
       3             12559     10408        2767      3087        2343      1867
       4             14244     10473        2557      2839        1875      1789
       5             10166     10523        2603      2695        1682      1730
       6              8388     10565        2836      2582        1459      1684
       S²                      10165                  3929                  2200
       g                       -0.02                  0.23                  0.15


TABLE 11

INTRACLASS CORRELATION COEFFICIENTS (ρ) FOR DAILY COUNT OF LANDING
BOATS AND DAILY CATCH WITHIN CLUSTERS OF D CONSECUTIVE DAYS,
BASED ON DATA COLLECTED AT 12 CENTRES DURING 7TH, 11TH,
AND 22ND WEEKS.

                                    Values of ρ
Number of con-        7th week          11th week          22nd week
secutive days in    Daily   Daily     Daily   Daily     Daily   Daily
cluster (D)         count   catch     count   catch     count   catch
       2            0.99    0.46      0.82    0.64      0.78    0.88
       3            0.86    0.46      0.60    0.52      0.75    0.92
       4            0.77              0.69    0.52      0.72    0.85
       5            0.71    0.30      0.69    0.56      0.72    0.81
       6            0.70    0.27      0.71    0.60      0.73    0.74


Substituting $S_D^2 = S^2/D^g$ in (6), we have

$$CV = S^2\,(c_1 D + c_2)/D^g. \qquad (7)$$

This can be shown to be a minimum when


$$D = \frac{c_2\,g}{c_1\,(1 - g)}.$$


For $g = 0.2$, we have

$$D = \frac{c_2}{4c_1}.$$


We conclude that, unless $c_2$ is larger than $6c_1$, the optimum value of
$D$ will be unity. Since in practice $c_2$ will be well within this limit, the
choice of centre-day would appear to be the optimum choice of
first-stage sampling unit.

The next step in developing this sampling design is to determine
the number of landing centres to be selected every day in order to
estimate the monthly catch with desired precision. This is obtained
by substituting $D = 1$ in equation (4). Assuming that the finite
population correction factor is unity and using $100\lambda$ to denote
the per cent standard error with which it is desired to estimate the
daily catch per month, we have


$$n = \frac{S^2}{30\,\lambda^2\,\bar{y}^2} \qquad (8)$$


where $\bar{y}$ is the average daily catch per centre.

As can be seen, the value of n depends upon the value of S? which 
in its turn will depend upon the degree of geographical stratification. 
The calculations based upon the data collected at 61 landing centres 
along the 100-mile stretch of the Malabar coast during 1950-51 relate 
to the stratification of the coast-line into 5 sections and are presented 
in Table 12. The table shows that the number of landing centres to 
be selected daily for estimating the monthly catch with 5 per cent 
standard error varies from 15 to 22. There is an indication that the 
number required for lean weeks is larger. 
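
The two design decisions of this section, the choice of D from (7) and the number of centres from (8), lend themselves to a short numerical sketch. The code below is illustrative and not the authors' computation: the function names are hypothetical, the search over D = 1 to 6 simply mirrors the range of Table 10, and rounding n up to the next whole centre is an assumption. The 7th-week figures (S² = 13441, ȳ = 111.44) are those quoted in Table 12.

```python
import math


def cost_variance_product(D, c1, c2, g, s2=1.0):
    """Equation (7): CV = S^2 (c1*D + c2) / D^g."""
    return s2 * (c1 * D + c2) / D ** g


def continuous_optimum_D(c1, c2, g):
    """Stationary point of (7) when D is treated as continuous."""
    return g * c2 / (c1 * (1.0 - g))


def best_integer_D(c1, c2, g, candidates=range(1, 7)):
    """Whole number of days minimising (7) among the candidates."""
    return min(candidates, key=lambda D: cost_variance_product(D, c1, c2, g))


def centres_per_day(s2, ybar, pct_se, days=30):
    """Equation (8) with D = 1, rounded up (the rounding rule is an assumption)."""
    lam = pct_se / 100.0
    return math.ceil(s2 / (days * lam ** 2 * ybar ** 2))


print(continuous_optimum_D(c1=1.0, c2=4.0, g=0.2))
print(best_integer_D(1.0, 5.0, 0.2), best_integer_D(1.0, 7.0, 0.2))
print([centres_per_day(13441, 111.44, se) for se in (5, 10, 15)])
```

With g = 0.2 the whole-number optimum stays at one day when the travel cost is five times the daily salary and moves to two days when it is seven times, in line with the limit of about 6c₁ given in the text; the 7th-week sample sizes come out as 15, 4, and 2 centres for 5, 10, and 15 per cent standard error.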


TABLE 12

NUMBER OF CENTRES (n) TO BE SELECTED EVERY DAY FOR ESTIMATING MONTHLY
CATCH PER HOUR WITH VARYING PER CENT STANDARD ERRORS (s.e.)

                            Per cent        Value of desired
                            s.e. = 100λ     variance of estimate    Value of n

7th week
  ȳ = 111.44 (md/hr)             5                31.05                 15
  S² = 13441 (md/hr)²           10               124.19                  4
                                15               279.42                  2

22nd week
  ȳ = 20.73 (md/hr)              5                                      22
  S² = 645 (md/hr)²             10                 4.30                  6
                                15                 9.67                  3


Ordinarily, estimates are not required for each month. They are
required for the season as a whole. On the other hand, separate estimates
are desired for principal species of fish. The two requirements
influence the sample size in opposite directions. We have not yet
completed investigation of this question.

In practice, the sampling design described above would need
modification owing to time lost in holidays and traveling (not all thirty
days of the month will be available for observation). If we assume
that d days are available for each enumerator every week for
observation, a simple and practical way to modify the design is to select
a sample of dn centres every week, divided into n samples of d each,
and assign each sample of d randomly for observation by an enumerator.
The modified design would of course necessitate a larger sample of
centres than is shown in Table 12, the increase depending upon the
value of d. Considering, however, the long length of the coast for
which the estimate will be normally made, even doubling the sample
size would not appear excessive.


5. The Description of Surveys 


Techniques described in the previous sections have been evolved 
in the course of a series of expanding sample surveys beginning with 
the pilot survey in 1950. This pilot survey was conducted along the 
100-mile coast of Malabar. The 61 landing centres which accounted 
for the fishing activity of all the villages on this coast were divided 
into 12 strata of geographically contiguous centres. The first nine 
strata contained 6 centres each; the 10th stratum contained 5 centres 
only, while each of the two remaining centres was taken to constitute 
a stratum by itself, because of the relatively high intensity of fishing 
reported at these centres. A centre-week was the first-stage sampling 
unit. The work in each stratum was entrusted to a pair of enumerators 
who moved after each week from one centre to another, completing one 
round of all centres in a cycle of as many weeks as the number of centres 
in the stratum. The field work in a week covered a period of 6 days 
from Saturday to Thursday. Friday, which is observed as a fishing 
holiday, was utilized for travel from centre to centre. On each day 
the work was spread over a continuous period of 14 hours from 5 a.m. 
to 7 p.m., divided into 2 equal shifts of 7 hours each. Each enumerator 
was assigned to one shift which he exchanged with his associate after a 
three-day period. The work of supervision was entrusted to 3
inspectors, each having the jurisdiction of 4 contiguous strata along the
coast. All the field staff employed for the survey was on a full-time
basis.

The method for keeping a count of landing boats and for recording 
the catch of fish per boat was as follows. Each hour was divided into 
3 equal intervals of 20 minutes each. One of the three intervals was 
utilized for counting the number of landing boats during that interval, 
while the first boat landing during the remaining 2 intervals taken 
together was contacted for recording the catch. The interval fixed for
boat count was the first one on the first day of each half week (Saturday
to Monday), the second one on the next day, and the last one during the
third day of each half week. The survey was conducted for one full
fishing season of 30 weeks' duration from the 2nd of September 1950
to the 31st of March 1951.

In regard to night fishing, enumeration of the landing of boats 
and catch was not found practicable. Instead, the enumerator of the 
morning shift was instructed to go to the coast 20 minutes earlier than 
the usual time and make a count of the boats that had landed after 
7 p.m. of the previous day and that were still on the shore at the time 
of his visit. A sample of 6 boats was selected out of these for recording 
the catch by interrogation of the fishermen and their mates. 

The sampling plan adopted for the day fishing may be regarded as 
a stratified 3-stage sampling design with centre-weeks forming the 
first-stage units, days forming the second-stage sampling units, and 
intervals within days forming the third-stage sampling units. Actually, 
the sampling fraction adopted at the second stage was 100 per cent and 
at the third stage was also very high. The plan adopted for the survey 
could thus be regarded as complete enumeration of the selected sample 
of centre-weeks within each stratum. Another feature of the plan was 
that the sampling of centre-weeks within each space-time stratum 
(of 6 weeks for the first 9 strata and 5 weeks for the 10th stratum) was
systematic in space and time. 

The catch for any hour was estimated as the product of three times 
the number of landings recorded in a 20-minute interval and the corre- 
sponding average catch per landing boat. The hourly catch was summed 
up over all the hours of the day and over all the days of the week to 
provide an estimate of the quantity of fish caught during each centre- 
week. Multiplied by the number of centres in each stratum in the 
population, it provided an estimate of the total catch for each space- 
time stratum. The estimate of the total catch for the entire coast and 
over the full fishing season was built up by summing up the corre- 
sponding estimates for the individual space-time strata. This estimate 
amounted to 6,238,000 maunds of fish for the 30-week period covered 
by the survey. To the estimated total so obtained for day fishing 
was added the estimated contribution from night fishing amounting 
to 370,000 maunds. 
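
The building-up of the estimate from a 20-minute count and the recorded catch per boat can be written out as a short sketch. The function names and the sample figures below are hypothetical; only the factor of three (one 20-minute interval counted out of each hour) and the two seasonal totals come from the text.

```python
def hourly_catch(landings_in_20_min, recorded_catches):
    """Three times the 20-minute count, times the average recorded catch per boat."""
    average_catch = sum(recorded_catches) / len(recorded_catches)
    return 3 * landings_in_20_min * average_catch


def centre_day_catch(hours):
    """Sum of the hourly estimates; hours is a list of (count, catches) pairs."""
    return sum(hourly_catch(count, catches) for count, catches in hours)


# Hypothetical day with two observed hours (catch in maunds).
day = [(4, [10.0, 14.0]), (2, [6.0])]
print(centre_day_catch(day))

# Seasonal total quoted in the text: day fishing plus night fishing.
print(6_238_000 + 370_000)
```

The second figure simply adds the night-fishing contribution to the day-fishing estimate, giving 6,608,000 maunds for the season.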

The standard errors were calculated from the variation between the 
centre-weeks in each space-time stratum. This is not strictly justified 
in view of the systematic nature of the selected sample in space and 
time. However, there was no evidence to suggest that the selected 
sample would behave appreciably differently from a randomly selected
sample. In fact, a study for the successive weeks at the two large
centres, Quilandi and Tanur, showed absence of significant positive 
interclass correlation between centre-weeks systematically placed in 
space and time. The standard error so calculated was found to be of 
the order of 5 per cent of the estimated catch for the entire fishing 
season. 

A similar pilot survey was carried out in 1952-53 on a 300-mile 
strip on the east coast of Madras State. Space does not permit a 
description of the survey and of its results. It will suffice to state that 
the survey confirmed the conclusions reached in the course of the first 
pilot survey. 

The next phase of the work was to utilize the experience of the two 
pilot surveys for organizing large-scale surveys over the entire coast 
to which the results of the respective pilot surveys appeared to be 
applicable. The first such extended survey was conducted on the west 
coast during 1953-54. Simultaneously, a typical strip of the Canara 
Coast where fishing is largely done with shore-seine nets was taken 
up for investigation. The design of the survey was modified to suit 
the participation of the departmental staff in field and supervisory 
work. The investigational stage being over, the surveys are now being 
taken over by the National Sample Survey Organization in order to 
extend them to the entire coast of India as a normal method of estimating 
monthly catch of marine fish. 


6. Summary 


The paper describes a sampling method developed in India for
estimating the monthly catch of marine fish brought to the coast by
fishing boats. It is divided into two parts: the first dealing with the
sampling procedure for estimating the daily catch at selected sections
of the coast and the second dealing with the choice of optimum
first-stage unit for sampling and the number of such units required for
estimating the monthly catch for the entire coast with given precision.

The sampling procedure for estimating the daily catch is based
on the study of hourly landings at a number of selected sections along
the coast, and it is concluded that a systematic selection involving
two visits of three hours (or three visits of two hours, in case the journey
time to the coast is shorter) during a day is both a practical and efficient
scheme of sampling at selected sections. The section of the coast found
most suitable for use as a sampling unit for observation is a landing
centre. The paper then gives the results of a study to determine the
optimum number of successive days for which a selected centre should
be observed, based on the data collected at 61 landing centres for two
months, and it is concluded that a centre-day is the optimum unit of
observation. The number of centres to be selected daily for estimating
the monthly catch for the entire coast with 5 per cent error is placed
between 15 and 22.

Finally, a brief description is given of the surveys conducted in 
India in the course of which the above technique was developed. The 
present surveys cover nearly 1200 miles of the east coast and 700 miles 
of the west coast. It is proposed to extend the survey to cover the entire 
coast of India as the normal method of estimating the monthly catch 
of marine fish.


THE METHOD OF MAXIMUM LIKELIHOOD APPLIED TO
THE POISSON BINOMIAL DISTRIBUTION


D. A. Sprott
Department of Psychiatry, University of Toronto, Toronto, Canada


Much has been written about the application of contagious distri- 
butions to plant and insect populations; for a discussion of the methods 
and a review of the literature see McGuire, Brindley, and Bancroft 
[1957]. Three distributions considered in the past are the negative 
binomial, the Neyman Type A, and the Poisson binomial. The method 
of maximum likelihood has been used to fit the negative binomial, as 
described by Fisher and Bliss [1953], and to fit the Neyman Type A, as 
described by Douglas [1955], and by Shenton [1949]. 

It is the purpose of this paper to describe a procedure for fitting 
the Poisson binomial distribution by the method of maximum likeli- 
hood, and to consider the efficiencies of the method of moments and the 
method of sample zero frequency. Since the Neyman Type A dis- 
tribution is a limiting form of the Poisson binomial, the formulae 
developed here will include as special cases some of the formulae de- 
veloped previously for the Neyman Type A. 


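MAXIMUM LIKELIHOOD EQUATIONS

The Poisson binomial distribution is

$$P(k) = e^{-a}\sum_{t=0}^{\infty}\frac{a^t}{t!}\binom{nt}{k}p^k q^{nt-k}, \qquad q = 1 - p;$$

it is assumed that $n$ is a known integer. The $P(k)$ obey the recursion
formula (McGuire, et al.), which can be written

$$P(k+1) = \frac{nap}{k+1}\sum_{t=0}^{k}\binom{n-1}{t}p^{t}q^{\,n-t-1}P(k-t). \qquad (1)$$

The mean and variance of the distribution are $anp$ and
$anp\,(1 + (n - 1)p)$.
Setting

$$S_1(k) = \sum_{t=0}^{\infty}\frac{a^t}{t!}\binom{nt}{k}p^k q^{nt-k},$$

we have

$$\frac{\partial S_1(k)}{\partial p} = \frac{k}{p}\,S_1(k) - \frac{k+1}{p}\,S_1(k+1). \qquad (2)$$

Also,

$$S_1(k) = e^{a}P(k).$$

In a similar way it can be shown

$$\frac{\partial S_1(k)}{\partial a} = \frac{k}{na}\,S_1(k) + \frac{q}{nap}\,(k+1)\,S_1(k+1).$$

Since

$$\log P(k) = -a + \log S_1(k),$$

the maximum likelihood equations are

$$\sum_k \frac{a_k}{S_1(k)}\frac{\partial S_1(k)}{\partial a} - N = 0, \qquad \sum_k \frac{a_k}{S_1(k)}\frac{\partial S_1(k)}{\partial p} = 0,$$

where $a_k$ is the observed frequency of $k$ and $N$ is the total number of
observations. These equations can be written

$$n\hat{a}\hat{p} = \bar{k}, \qquad L(\hat{p}) = \sum_k a_k F(k) - N = 0, \qquad (3)$$

where

$$F(k) = \frac{(k+1)\,P(k+1)}{n\hat{a}\hat{p}\;P(k)} \qquad (4)$$

and $\hat{a}$, $\hat{p}$ are the maximum likelihood estimates of $a$, $p$.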
NUMERICAL SOLUTION OF THE MAXIMUM LIKELIHOOD EQUATIONS 


Proceeding as usual, it is necessary to evaluate L′(p), where a is 
replaced by k̄/np. But 


so that 
= Ya, + 1p). (5) 


Employing the method used in deriving (2), and noting that 

$$\frac{da}{dp} = -\frac{a}{p},$$

it is only a matter of laborious computation to show that 
dS, _ gk + + 2) E + 
dp -(1 + 4) nap e’P(k + 2) + 


Substituting these values into (5), using (4), and simplifying, gives 

$$L'(p) = \sum_k a_k F(k) \left[ \frac{n-1}{np} - \left(1 + \frac{q}{np}\right) na\, \Delta F(k) \right], \tag{6}$$

where ΔF(k) = F(k + 1) − F(k). If p′ is a trial value for p̂, then a better approximation is 

$$p'' = p' - \frac{L(p')}{L'(p')};$$

the P(k) can be calculated recursively by (1). By letting n → ∞, 
p → 0, q → 1, so that np → m_2, a → m_1, the corresponding formula 
for the Neyman Type A distribution can be found as 

$$L'(m_2) = \lim_{n \to \infty} \frac{L'(p)}{n} = \sum_k a_k F(k) \left[ \frac{1}{m_2} - \left(1 + \frac{1}{m_2}\right) m_1\, \Delta F(k) \right],$$

which is equivalent to the form given by Douglas [1955]. 

Example. The following example will give an indication of the required 
calculations; for n = 2, the calculation of the P(k) is relatively 
simple, as each P(k) depends only on the two preceding P's. Consider 
distribution (6) of McGuire, et al.: N = 1296, k̄ = .410494, n = 2, 
and s² = .512050. Using the method of moments, the estimates are 
p′ = .2474 and a′ = .8296; the following calculations are performed: 


TABLE 1 
COMPUTATIONS FOR L 

k    P(k)       (k + 1)P(k + 1)    F(k)        a_k F(k) 
0    .697875    .215600             .752600     682.608 
1    .215600    .137480            1.553402     427.186 
2    .068740    .043131            1.528527     134.510 
3    .014377    .011424            1.935723      44.521 
4    .002856    .002370            2.021545       6.065 
5    .000474    .000438 
                                      Total    1294.890 

Thus the first trial value of L is 1294.890 − 1296 = −1.110. The 
correction to apply can be calculated as follows: 

TABLE 2 
COMPUTATIONS FOR L′ 

      (1)         (2)                     (3)                  (4) 
k     ΔF(k)       (1 + q/np) na ΔF(k)     (n − 1)/np − (2)     a_k F(k) × (3) 
0      .800802     3.349742               −1.328715            −906.991 
1     −.024875     −.104052                2.125079             907.804 
2      .407196     1.703294                 .317733              42.738 
3      .085822      .358992                1.662035              73.995 
4      .129525      .541801                1.479226               8.972 

                                                         L′ =   126.518 


Thus the next trial value is 


p″ = .2474 + 1.110/126.518 = .2562. 
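The iteration of this example can be sketched in a few lines of code. This is an illustrative sketch rather than the author's computing scheme: the function names are hypothetical, F(k) is taken as (k + 1)P(k + 1)/(k̄ P(k)) with a replaced by k̄/np, L′ is computed from formula (6), the observed frequencies are those of Table 3, and the last class is treated as k = 4, as Table 1 appears to do.

```python
import math

def poisson_binomial_pmf(kmax, a, p, n):
    """P(0), ..., P(kmax) from P(0) = exp(-a(1 - q^n)) and the recursion."""
    q = 1.0 - p
    P = [math.exp(-a * (1.0 - q**n))]
    for k in range(kmax):
        s = sum(math.comb(n - 1, t) * p**t * q**(n - 1 - t) * P[k - t]
                for t in range(min(k, n - 1) + 1))
        P.append(n * a * p * s / (k + 1))
    return P

def L_and_slope(p, counts, n):
    """Return L(p) and L'(p), with a replaced by kbar/(n p)."""
    N = sum(counts)
    kbar = sum(k * c for k, c in enumerate(counts)) / N
    a = kbar / (n * p)
    q = 1.0 - p
    K = len(counts)
    P = poisson_binomial_pmf(K + 1, a, p, n)
    F = [(k + 1) * P[k + 1] / (kbar * P[k]) for k in range(K + 1)]
    L = sum(c * F[k] for k, c in enumerate(counts)) - N
    slope = sum(
        c * F[k] * ((n - 1) / (n * p)
                    - (1 + q / (n * p)) * n * a * (F[k + 1] - F[k]))
        for k, c in enumerate(counts))
    return L, slope

def fit_p(counts, n, p_start, tol=1e-9, max_iter=50):
    """Newton iteration p'' = p' - L(p')/L'(p'); returns (p_hat, a_hat)."""
    p = p_start
    for _ in range(max_iter):
        L, slope = L_and_slope(p, counts, n)
        if abs(L) < tol:
            break
        p -= L / slope
    N = sum(counts)
    kbar = sum(k * c for k, c in enumerate(counts)) / N
    return p, kbar / (n * p)

counts = [907, 275, 88, 23, 3]          # observed frequencies of k = 0, ..., 4
L0, slope0 = L_and_slope(0.2474, counts, 2)
p_hat, a_hat = fit_p(counts, 2, 0.2474)
print(L0, slope0, p_hat, a_hat)
```

Carried out in full floating-point precision, the first trial values of L and L′ should land close to, but not exactly on, the six-decimal hand computations of Tables 1 and 2, and the converged estimate of p should lie close to the second trial value .2562.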


The expected values for p = .2562 and .2474 are compared in the 
following table: 


TABLE 3 
EXPECTED VALUES: MAXIMUM LIKELIHOOD (ML) AND METHOD OF 
MOMENTS (MM) 

k     Expected (ML)    Observed    Expected (MM) 
0        906.09           907         904.44 
1        276.61           275         279.42 
2         89.89            88          89.09 
3         18.85            23          18.63 
4+         3.80             3           4.31 

      χ² = .55                      χ² = .61 
      L = .036                      L = −1.110 


VARIANCES OF THE MAXIMUM LIKELIHOOD ESTIMATES 


The information matrix can be found by differentiating log P and 
summing, noting that 


kP(k) = nap, = nap[l + (n — 1)p] + 


aP(k) _ np + (n — 1l)pq 

| = + nal 1 n+4), 
aP(k) || dP(k) 


where 
A=-1+ 
The determinant of the matrix (I/N) is 
Thus 


3 
— —— 
= 
| 
t 
| 
: 
7 
I 
By letting n → ∞, p → 0, so that np → m_2 and a → m_1, the corresponding 
formulae for the Neyman Type A distribution can be found: 


it, 2 
N = A, = yy = mA 


and 
Me 


as given by Shenton [1949]. For n = 2, the formula for A can be 


simplified to 
P*(k — 1) 


EFFICIENCY OF THE METHOD OF MOMENTS 


Using the methods of Fisher [1941], the determinant of the co- 
variance matrix of the estimates found by the method of moments is 


{n*a’p*[1 + — l)p + 6m — 1)(m — 2)p’ 
+ (n — Im — 2)(n — + — Dp) 
+ 2n*a*p*[1 + (n — — n’a’p*[1 + — 1)p 
+ (n — 1)(n — 2)p*P} — 1)’a*p*N?. 


For n = 2, this simplifies to 

$$\frac{q + 2a + 6ap + 6ap^2 + 2ap^3}{2pN^2}.$$
TABLE 4 
EFFICIENCY OF THE METHOD OF MOMENTS FOR n = 2 IN PERCENT 

               a 
  p      .1    .5    .8    1     2 
  .1     93    89    84    84    87 
  .2     84    68    67    68    75 
  .3     73    53    51    53    64 


As n → ∞, for the Neyman Type A distribution, this determinant is 


2 + + ms + 2m (1 + ms)* 
m,N* 


The reciprocal of the efficiency is D,,, multiplied by the appropriate 
determinant above. For any given a, the efficiency → 0 as p → 1. 


EFFICIENCY OF THE METHOD OF SAMPLE ZERO FREQUENCY 


In this method, the observed proportion of zeros is used to form 
the second equation of estimation. The equations are 

$$nap = \bar{k} = m, \qquad a_0 = N P(0) = N e^{-a(1 - q^n)}.$$

For n = 2, these equations are 

$$m = 2ap = \bar{k}, \qquad p = 2 + \frac{2}{m} \log \frac{a_0}{N}.$$

Using the method of Anscombe [1950 (p. 368)], noting that if f_0 = 1 
(and f_i = 0 for i ≠ 0), then 

$$E(f_i) = F(m, p) = P(0)$$


so that 
A. = 2F(m, p) — 
om np 
_ 9F(m, p) _ PO) 
thus 
cov(p, m) = = = ¢) = np} _ pa 


— — npq"") N’ 


Nm’P(0)(1—q"—npq™ 


and 


_ + (nm — 


var m 


N 
For n = 2, the covariance matrix of the estimates m and p is 


m(1 + p) Pq 
N N 
Pq 4{P(0)[(1 — 3p)’m(1 + p) — m(2 — p) — 1] +1} 
N Nm’‘P(0) 


The information matrix for the parameters m and p can be shown to be 
1 
= Da.» - 


Thus, as in the method of moments, the efficiency can be found. 


TABLE 5 
EFFICIENCY OF THE METHOD OF SAMPLE ZERO FREQUENCY (n = 2) IN PERCENT 
1 5 8 1 1.5 2 
p 1 99 98 99 99 
2 96 94 96 97 
3 94 88 90 92 
4 86 70 70 72 74 85 
TABLE 6 
POISSON BINOMIAL DISTRIBUTION, N = 1296, n = 2 
k Expected value | Expected value | Expected value | Observed value 
Method (1) Method (2) Method (3) 
0 461.53 423 .00 427 .67 423 
1 349.79 394.33 389.30 414 
2 259.29 263.09 262.04 253 
3 129.54 131.03 131.00 117 
4 60.14 55.19 55.81 53 
5 23.34 20.11 20.56 22 
6 8.45 a) 6.81 4 
7 2.75 1.95 2.05 5 
8 .84 .54 .57 3 
9 .32 .19 15 2 
-1.117 | 0<L< 
= 17.47 x@ = 5.57 | = 5.71 


For p < .3, the efficiency is around 90% or more. For p > .3, and if 
a is not too large, it may be necessary to use maximum likelihood 
estimation. For any given a, the efficiency → 0 as p → 1. 


EXAMPLES 


Example 1: distribution (4) from McGuire, et al. 

(1) The method of moments applied to the Poisson binomial with 
n = 2 gives p = .420; (2) the method of sample zero frequency gives 
p = .2868; (3) the method of maximum likelihood gives p = .304. 

Example 2: distribution (7) from McGuire, et al.: n = 2, N = 324. 
Methods (1), (2) and (3) above, applied to the Poisson binomial give 
the estimates p = .5798, .4262, and .4016 respectively. 


TABLE 7 
POISSON BINOMIAL DISTRIBUTION, N = 324, n = 2 

k Expected value | Expected value | Expected value | Observed value 

Method (1) Method (2) Method (3) 
0 100.97 89.00 87.22 89 
1 69.66 83.86 85.70 96 
2 72.09 70.65 70.86 57 
3 38.69 41.75 42.05 44 
4 23.83 22.19 22.01 16 
5 10.65 10.02 9.87 11 
6 5.01 4.16 4.04 7 
7 1.94 1.56 1.50 3 
8 1.16 .75 1 

xd) = 18.90 xis) = 9.40 xi) = 9.88 
L=17 L = .779 L = .000 


These examples illustrate the fact that the method of moments 
may provide quite misleading estimates of p when this parameter is 
not very small; the method of sample zero frequency remains reasonably 
efficient for considerably larger values of p. 
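The two simpler estimates quoted in Examples 1 and 2 can be reproduced from the observed columns of Tables 6 and 7. The sketch below is illustrative only: the function names are hypothetical, the sample variance is taken with divisor N (an assumption, chosen because it appears to match the quoted figures), and the top class of each table is treated as an exact count.

```python
import math

def sample_mean_and_variance(counts):
    """Mean and variance (divisor N) of a frequency table for k = 0, 1, 2, ..."""
    N = sum(counts)
    mean = sum(k * c for k, c in enumerate(counts)) / N
    var = sum(k * k * c for k, c in enumerate(counts)) / N - mean**2
    return mean, var

def moment_estimate_p(counts, n=2):
    """Method of moments: mean = nap and variance = nap[1 + (n - 1)p]."""
    mean, var = sample_mean_and_variance(counts)
    return (var / mean - 1.0) / (n - 1)

def zero_frequency_estimate_p(counts):
    """Method of sample zero frequency for n = 2: p = 2 + (2/m) log(a_0/N)."""
    N = sum(counts)
    m, _ = sample_mean_and_variance(counts)
    return 2.0 + (2.0 / m) * math.log(counts[0] / N)

table6_observed = [423, 414, 253, 117, 53, 22, 4, 5, 3, 2]   # N = 1296
table7_observed = [89, 96, 57, 44, 16, 11, 7, 3, 1]          # N = 324

for observed in (table6_observed, table7_observed):
    print(moment_estimate_p(observed), zero_frequency_estimate_p(observed))
```

Under these assumptions the printed values should agree with the estimates given for methods (1) and (2) in the two examples to about three decimals.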
A MULTIVARIATE ANALYSIS OF COVARIANCE* 


H. Smith** 


North Carolina State College 
Raleigh, North Carolina, U.S.A. 


THE PROBLEM 


DeLury [1948] presented a covariance analysis with several charac- 
ters which became somewhat complicated, both because the magnitude 
of treatment effects increased with treatment duration, and because 
one of the variables which originally had been intended for use as a 
concomitant control was affected by treatments. No satisfactory form 
of analysis for dealing with these complications was reached, and the 
problems raised seem worth re-examining. 

The weights of hind-leg muscles of rats, one intact (v), one denervated 
(u), were observed at 4, 8, and 12 days after operation followed by 
treatment twice per day with one of four drugs (of which D was merely 
control saline). There were four randomly chosen rats in each of the 
twelve groups. Initial (x) and final body weights (y) were also observed. 

The objective of the experiment was to determine whether the 
drugs delay atrophy of denervated muscles. Apparently it had been 
hoped that drugs would not affect intact muscles, so that they could 
be used as concomitant control. But inspection of the data revealed 
that drugs seriously affected both intact muscles (v) and final body 
weights (y). The question was then asked whether a drug might delay 
the rate of decay of denervated muscle relative to the shrinkage to be 
expected if there is no drug × denervation interaction. The common 
empirical definition of null interaction for additive effects is, however, 
inappropriate and irrelevant: shrinkages caused by drug and denervation 
cannot possibly be additive when measured in grammes or as propor- 
tions, and there is no hypothesis to define how the two effects should 
superimpose and thence to indicate a transformation by which null 
interaction would be additively expressed. 


*Part of a general review of analysis of covariance methods: sponsored by the Office of Ordnance 
Research, U.S. Army. Another part was contributed to the symposium at Detroit, September 1956, 
and published in Biometrics, September 1957. 

**Present address: Statistical Center, University of the Philippines, Manila, Philippines. 
Having observed that v and y are affected by treatments, DeLury 
duly noted that initial body weight (x) is the only variable for un- 
ambiguous use as concomitant control in the usual way. He never- 
theless continued with an endeavor to use error regression on v or y 
in similar manner as a means of allowing for drug effect on healthy 
muscle in an attempt to answer the above question. Much of his paper 
is a discussion of the difficulties into which one may be led by such 
procedure. In effect he reached the conclusion that it is not a very 
meaningful attack. 

He then endeavoured to develop a more elaborate model which 
might better represent a plausible relation among the variables, and 
to deduce from it a relation among the observations which could be 
put to statistical test. The idea is good, but for a number of reasons 
the derived regression fails to represent, and to provide a test for, the 
hypothesis. The chief reason is that v is brought into the hypothesis 
as an empirical substitute for unknown functions of time representing 
the rates of growth or shrinkage of intact muscle with each drug. 
Contrariwise, the estimated regression is derived from variation within 
treatments, each at a single fixed time, where variation of v can be 
only a feature of varying sizes of rats and has nothing whatever to 
do with change of size over time as required by the hypothesis. Con- 
sequently, the coefficients of the estimated regression are not even 
remotely estimates of the respective symbols of the hypothesis. In 
fact the proposed test ends up by being in effect similar to the dis- 
carded tests of sections I and II, but disguised and modified by having 
been incorrectly weighted by times of observation. 

The usual analysis of covariance procedure has been frequently 
used to "adjust" the experimental variate for two characters such as 
x and v (or x and y), one of which has been affected by treatments. 
But no paper which I have seen seems to recognize that the adjustments 
are incompatible; that estimates of the adjusted means are estimates 
of an artifact which cannot exist. If drugs differentially affect v, it is 
unrealistic to postulate that similar animals on different drugs can 
have simultaneously both equal x and equal v: a value of u adjusted 
for such a postulate seems meaningless, being analogous to estimating 
a function of x and x² for x = a and x² = b² when a ≠ b. What then 
should be done with data of this sort? 


DEFINITION OF TREATMENT MEANS AND EFFECTS 


I think we must look at the intact muscle as a dependent experi- 
mentally varied quantity with the same status as the denervated 
muscle. We could make a split-plot type of analysis to compare the effect 
of drugs on the average of both muscle conditions (main-plots) and on 
the differences between the two. But knowing a priori that differences 
will increase with time, it is almost certain that their rate of increase 
must be affected by treatment if only because the potentiality for 
shrinkage is limited. One may therefore forecast that denervation × 
drug interactions, as defined by the usual additive model for a split- 
plot analysis, would be complicated. When interactions are complex, 
the only thing to do is to study the individual treatment combinations, 
summarizing where a systematic pattern can be found. Here we have 
one simple quantitative treatment, time, whose effect should be system- 
atically progressive. We may therefore reduce the number of de- 
scriptive statistics required by considering regressions on that factor. 

Shrinkage being inevitably limited it cannot progress linearly. 
Proportionate differences may, however, be longer sustained, and by 
using logarithms we may be able to flatten the curves enough for linear 
regressions to be at least first approximations. One procedure (con- 
sidered in part by DeLury [1948] at p. 166) would be to combine the 
three times for each drug into estimates of mean, linear, and curvature 
effects, and then compare these between drugs, hoping that curvatures 
may be small and negligible. But since at zero time (which was not 
observed) all muscles start from similar sizes for equal sized rats, the 
means and linear terms would be telling the same story. Can we con- 
centrate both pieces of information in one statistic? 


Notation: For brevity let logarithms be denoted by capital letters: 
X = log₁₀ x, Y = log₁₀ y, etc. Let Z stand for any one of the variates 
Y, V, U, or for a difference between any two of them. X, being the 
independent variable of regressions used to adjust estimates for initial 
body weight, will usually be separately distinguished; subscripts attached 
to it will have the same meaning as for Z. 

Z_{it} denotes the mean of the four observations on Z for rats on drug 
i at time t; i = A, B, C or D; t = 1, 2, 3; 

$$Z_{it} = \tfrac{1}{4} \sum_k Z_{itk},$$

where k denotes individual observations within treatments. 

A subscript dot denotes an average over the replaced subscript, 
that is 

$$Z_{i.} = \tfrac{1}{3} \sum_t Z_{it}, \qquad Z_{.t} = \tfrac{1}{4} \sum_i Z_{it},$$

and the mean for all 48 observed animals, which will be required only 
for X, is 

$$X_{..} = \tfrac{1}{12} \sum_i \sum_t X_{it}.$$

Lower case b (without subscript) denotes the regression of Z on 
X as evaluated from the within treatments sums of squares and products 
(Table 2). The respective dependent variate in each case will be evident 
from the context. Examination of the regression on X for each treat- 
ment separately suggested possible heterogeneity. Though there were 
hints that the regression might be affected by drugs there was no 
sufficiently consistent pattern to indicate any better estimable adjust- 
ment for initial body weights than by the overall average remainder 
regression as in standard analysis of covariance practice. 

Adjusted means are denoted by a prime, 

$$Z'_{it} = Z_{it} - b(X_{it} - X_{..}). \tag{1}$$

The common value, X_{..}, to which each observation is nominally adjusted, 
is arbitrary. Adjusted weighted means to be defined later 
would theoretically be most precise at the corresponding weighted 
mean of X (at which point the adjusting term would vanish). For 
example, Z′_{iw} is most precisely evaluated at X_{iw}; the average precision 
of Z′_{iw} would be greatest at X_{.w}. However, it is convenient to consider 
all estimates as adjusted to one common single value of X; the overall 
arithmetic mean, as usually used, seemed as good a choice as any. 
Contrasts are of chief interest and these are not affected since any 
arbitrarily chosen value disappears from differences which are measured 
as distances between regression lines assumed to be parallel. For 
example, the adjusted difference between two treatment means is 

$$Z'_{it} - Z'_{jt} = Z_{it} - Z_{jt} - b(X_{it} - X_{jt}),$$

from which X_{..} [or any other arbitrary central value which might have 
been used in (1)] has disappeared. However, it will be seen later that 
adoption of X_{.w} would have produced a slight simplification in Formula 
(5). 
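The cancellation of the arbitrary central value from contrasts can be illustrated numerically. The figures in this sketch are hypothetical and serve only to show the algebra of (1); the function name is likewise illustrative.

```python
def adjusted_mean(z_mean, x_mean, b, x_ref):
    """Formula (1): a treatment mean of Z adjusted to the reference value x_ref of X."""
    return z_mean - b * (x_mean - x_ref)

# Hypothetical treatment means of Z and X, and a hypothetical error regression b.
z_i, x_i = 0.105, 2.351
z_j, x_j = 0.062, 2.338
b = 0.9

for x_ref in (2.3435, 2.30, 0.0):
    difference = (adjusted_mean(z_i, x_i, b, x_ref)
                  - adjusted_mean(z_j, x_j, b, x_ref))
    print(x_ref, difference)
```

Each line of output shows the same adjusted difference, (z_i − z_j) − b(x_i − x_j), whatever reference value is chosen.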


Definitions: Assuming a linear regression on time, estimated for rats 
on drug i by 

$$\hat{Z}_{it} = Z_{i.} + \tfrac{1}{2}(Z_{i3} - Z_{i1})(t - 2),$$

the least squares estimate of mean size at zero time (the regression 
intercept at t = 0) is 

$$Z_{i0} = Z_{i.} - (Z_{i3} - Z_{i1}) = (4Z_{i1} + Z_{i2} - 2Z_{i3})/3.$$

To indicate possible curvature we consider the quantity Q, proportional 
to the quadratic coefficient of a parabolic regression, as is 
commonly used in analyzing data from a fertilizer trial at three levels 
(Yates [1937]): 

$$Q_i = Z_{i1} - 2Z_{i2} + Z_{i3}.$$

Q_i is a linear function of the observations orthogonal to Z_{i0}. The 
third linear function orthogonal to both of these must then be proportional 
to the weighted mean: 

$$Z_{iw} = (Z_{i1} + 2Z_{i2} + 3Z_{i3})/6.$$

These three linear functions contain all the information available in 
the three means from which they have been compounded, and, supposing 
normally distributed observations with constant variance, they are 
statistically independent. They are conveniently exhibited for reference 
in matrix form by Table 1. 


TABLE 1 
DEFINITION OF TIME EFFECTS 

            Z_{i1}   Z_{i2}   Z_{i3} 
3Z_{i0}       4        1       −2 
6Z_{iw}       1        2        3 
Q_i           1       −2        1 

If the linear regression model be adequate all Z_{i0} should be equal 
within sampling error, all Q_i should be approximately zero, and hence 
all drug effects should be efficiently represented by contrasts among 
the Z_{iw}. These are similar to the linear function of yields suggested 
by Fisher [1935, Sec. 50] for comparison of fertilizer qualities over 
several levels of application. 

Assuming linear regression on time, Z_{iw} is an interpolated estimate 
of the character at t = 7/3, or at 28/3 days since 4 days has been taken 
as the unit of time. If, as when Z = V − U, we can assume that the 
true value is zero when t = 0, then 

$$3Z_{iw}/7 = \sum_t t Z_{it} \Big/ \sum_t t^2$$

is the least squares estimate of the slope of the regression forced through 
a known origin. Its error variance is [var(Z_{it})]/14. When the intercept 
is unknown then 

$$3(Z_{iw} - Z_{i0})/7 = (Z_{i3} - Z_{i1})/2$$

is the usual estimate of slope of the regression through the observed 
mean with variance = [var(Z_{it})]/2. However, even when the intercept 
is not known, it should be the same for all four drug treatments; hence 
we may fit 

$$Z_{it} = \alpha + \beta_i t \tag{2}$$

where the constant α is the same for all drugs. The least squares estimate 
of α is then easily found to be Z_{.0}, and that of β_i is 3(Z_{iw} − Z_{.0})/7 
with variance now only 5[var(Z_{it})]/28. Furthermore when comparing 
the regressions for two drugs the common constant drops out from their 
difference so that its estimate is simply 

$$(\beta_i - \beta_j) = 3(Z_{iw} - Z_{jw})/7$$

and has variance [var(Z_{it})]/7, the same as for a comparison between 
two regressions through a known origin and more precise than the 
estimate of either coefficient individually. 
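The orthogonality of the three linear functions of Table 1, and the variance factors quoted for the several slope estimates, can be verified with exact arithmetic. This is an illustrative sketch with hypothetical names; variances are expressed in units of var(Z_{it}), the variance of one treatment mean, and the four drugs are assumed independent.

```python
from fractions import Fraction as Fr

# Coefficients of (Z_i1, Z_i2, Z_i3) in each linear function of Table 1.
INTERCEPT = [Fr(4, 3), Fr(1, 3), Fr(-2, 3)]   # Z_i0
WEIGHTED = [Fr(1, 6), Fr(2, 6), Fr(3, 6)]     # weighted mean
CURVATURE = [Fr(1), Fr(-2), Fr(1)]            # Q_i

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def var_factor(w):
    """Variance of a linear function of independent means, in units of var(Z_it)."""
    return dot(w, w)

n_drugs = 4
scale = Fr(3, 7) ** 2

# Slope through a known origin: 3 Z_iw / 7.
slope_known_origin = scale * var_factor(WEIGHTED)
# Slope with the drug's own intercept: 3 (Z_iw - Z_i0) / 7.
slope_own_intercept = scale * (var_factor(WEIGHTED) + var_factor(INTERCEPT))
# Slope with the common intercept Z_.0, the mean of four independent Z_i0.
slope_common_intercept = scale * (var_factor(WEIGHTED)
                                  + var_factor(INTERCEPT) / n_drugs)
# Difference between the slopes for two drugs: 3 (Z_iw - Z_jw) / 7.
slope_difference = scale * 2 * var_factor(WEIGHTED)

print(slope_known_origin, slope_own_intercept,
      slope_common_intercept, slope_difference)
```

The four printed fractions should be the factors 1/14, 1/2, 5/28 and 1/7 stated in the text.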
The factor 3/7 being somewhat a nuisance, and dependent on an 
arbitrary time scale, we can conveniently redefine the regressions on 
time as 

$$B_i(Z) = 7\beta_i/3 = Z_{iw} - Z_0$$

where Z_0 may be either a known or estimated intercept at t = 0. B_i(Z) 
is then an estimate of the rate of change per 28/3 days for the character 
Z with drug i. 

If the linear regression model does not in fact fit, Z_{iw} and contrasts 
between them are still valid for the specified weighted means, although 
interpretation may not be quite so simple and they will no longer 
contain all the information about drug effects. For example, suppose 
true regressions are 

$$Z_{it} = \alpha + \beta_i t + \gamma_i t^2$$

without a maximum or minimum within the relevant range (that is, 
|γ/β| < 1/6 when 0 < t < 3) then Z_{iw}, as defined for observations at 
t = 1, 2, and 3, estimates the yield at approximately 

$$t = \frac{7}{3} + \frac{5\gamma_i}{9\beta_i}.$$

This will not usually differ much from 7/3 specified by the linear 
hypothesis. The estimate Z_{i0} will be biased from the true yield (α) 
at zero time by −10γ_i/3. 
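The behaviour of the weighted mean and of the intercept estimate under a parabolic regression can be checked directly. The coefficients below are hypothetical, chosen so that |γ/β| < 1/6; the first-order approximation 7/3 + 5γ/(9β) for the time at which the weighted mean applies follows from expanding the parabola about t = 7/3.

```python
import math

def weighted_mean(z1, z2, z3):
    return (z1 + 2 * z2 + 3 * z3) / 6

def intercept_estimate(z1, z2, z3):
    return (4 * z1 + z2 - 2 * z3) / 3

# Hypothetical true parabola Z = alpha + beta*t + gamma*t^2.
alpha, beta, gamma = 1.0, 0.30, 0.02
z = [alpha + beta * t + gamma * t * t for t in (1, 2, 3)]

zw = weighted_mean(*z)
# Exact time t* at which the parabola equals the weighted mean:
# gamma t^2 + beta t - (zw - alpha) = 0, taking the root near 7/3.
t_exact = (-beta + math.sqrt(beta**2 + 4 * gamma * (zw - alpha))) / (2 * gamma)
t_approx = 7 / 3 + 5 * gamma / (9 * beta)

bias = intercept_estimate(*z) - alpha
print(t_exact, t_approx, bias, -10 * gamma / 3)
```

For these values the exact and approximate times differ only in the second decimal, and the bias of the intercept estimate equals −10γ/3.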


TESTS OF SIGNIFICANCE 


Sums of squares and products of effects as defined in Table 1 (with 
appropriate divisors) lead to the analyses of variance and covariance 
exhibited in Table 2. Since Z_{.0} and Z_{.w} are estimates of yields, as 
distinguished from contrasts, they are not independent of origin and 
their squares are of no interest. For convenience of computation 
Table 2 is based on X and Y reduced by 2.25, V and U increased by 
0.25; and all multiplied by 10³. 

Average curvature, Q_. , with only one degree of freedom, is conveniently 
evaluated by a t-test (which of course gives the same test of 
significance as would the "reduced square"*). By the usual formula 
the estimate Q′ for a character Z adjusted for X is 

$$Q'_{.}(Z) = Q_{.}(Z) - b\,Q_{.}(X).$$

Correspondingly (see below) its variance is 

$$\mathrm{var}(Q'_{.}) = \left[\frac{6}{16} + \frac{(6.3)^2}{143891}\right] s_Z^2,$$

where s²_Z = variance of a single observation about the error regression 
on X (see Table 3), 

6 = 1² + 2² + 1² = s. sq. of coefficients of Z_{.t} in the linear 
function Q_. , 

16 = number of rats in each Z_{.t} , 

6.3 = Q_.(X) × 10³, 

143891 = S.Sq.(X) from which the regression was evaluated 
(Table 2). 

This leads for each character to (for logarithms × 10³) 

Q′(Y) = 9.41 ± 16.75 
Q′(V) = −23.37 ± 27.88 
Q′(U) = 9.00 ± 39.99 
Q′(V − U) = 14.88 ± 31.88. 

Clearly average curvatures are indetectable relative to experimental 
error. 

Table 3 presents the reduced sums of squares, derived in the usual 
way, to test significance of variation between drugs after adjusting to 
common X. The results are clear cut without need for reference to the 


*I follow Fisher (Statistical Methods, Sec. 46.1) in using the term "reduced mean square" to 
indicate a mean square whose expectation on the null hypothesis is equal to random variance about 
a fitted regression. With only one degree of freedom the word "mean" seems redundant. 
F table. No differential effects due to drugs are detectable for estimates 
at zero time or for curvatures. Only two of these ten mean squares 
exceed their respective error variances, and they have individually 
P > .1. Therefore, taken along with the earlier finding that average 
curvatures are also approximately zero, the linear hypothesis is adequate 
to describe the data to the degree of approximation permitted by 
experimental error. Differential effects on rates of shrinkage are very 
highly significant for all characters. 


ESTIMATION OF ADJUSTED TREATMENT EFFECTS AND THEIR 
STANDARD ERRORS 


From (1) it follows (as expounded by Wishart [1936]) that the 
adjustment for initial body weights to be applied to any linear function 
of treatment means depends on the homologous linear function of X. 
Assuming normally distributed observations, b is independent of 
treatment means, hence the variance of an adjusted effect is the sum 
of the variances of the observed effect and of the adjustment. Func- 
tions of X are treated as constants, and 


$$\mathrm{var}(b) = s_Z^2 / \mathrm{dev}^2 X$$

where s²_Z is the estimated variance of Z about its within treatment 
regression on X (Table 3), and dev² X is the sum of squares of X within 
treatments, viz. .143891 (Table 2). 

Variation among Z′_{i0} being ascribable to random variability of 
observations, we can accept their mean, Z′_{.0}, as the common estimate 
for all treatments at zero time. Following the above rules 

$$Z'_{.0} = Z_{.0} - b(X_{.0} - X_{..})$$

and its variance is 

$$\mathrm{var}(Z'_{.0}) = \left[\frac{1}{16}\cdot\frac{21}{9} + \frac{(X_{.0} - X_{..})^2}{\mathrm{dev}^2 X}\right] s_Z^2 = .14688\, s_Z^2 .$$

The estimates and their standard errors are shown at the top of Table 4. 

If the linear hypothesis is adequate, we expect V′_{.0} = U′_{.0}. This 
is not disproven (t = 1.6). 

Y′_{.0} is less than X_{..} by .02424 ± .01048 and this is significant 
(t = 2.31, 35 df., P ca. .03). If both x and y were similarly observed 
live weights these should be equal. But a real difference is further 
indicated by the variability of Y within treatments being strikingly 
less than that of X. These body weights seem somehow to have been 
differently observed: y may have been dead weights after cleaning out 
excreta. 
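The two t ratios quoted above can be recomputed from the estimates and standard errors given in the text and in Table 4. The sketch below is illustrative: the two-sided probability is obtained by numerical integration of the t density, using only the standard library, and the function names are hypothetical.

```python
import math

def t_density(x, df):
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided tail probability by Simpson's rule on [0, |t|]; steps must be even."""
    t = abs(t)
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1.0 - 2.0 * total * h / 3.0

t_body = 0.02424 / 0.01048     # X.. minus the adjusted initial Y, over its s.e.
t_muscle = 0.0319 / 0.0199     # initial (V - U)' over its s.e. (Table 4)

print(t_body, two_sided_p(t_body, 35))
print(t_muscle, two_sided_p(t_muscle, 35))
```

The first ratio should reproduce t = 2.31 with a probability near the quoted .03; the second gives t of about 1.6, with a probability above .1.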
TABLE 4 


ESTIMATES OF LOGARITHMS OF INITIAL WEIGHTS AND OF RATES OF SHRINKAGE (OR 
GROWTH) PER 9.33 DAYS FOR AN AVERAGE SIZE OF RAT (X.. = 2.3435) 


Drug              Y′        V′        U′        (V−U)′     (V−Y)′ 


Initial weights (Z′.0) 

All              2.3192    .1245    (.1564)   (−.0319)   −2.1940 
S.e.              .0105    .0174     .0250      .0199      .0129 


Regressions on time (B′i) 

D                 .0572    .0696    −.1398      .2094      .0104 
B                 .0391   −.0108    −.2154                −.0520 
C                −.0326   −.0902    −.2111      .1209     −.0570 
A                −.0613   −.2205    −.3558      .1352     −.1432 
S.e.              .0136    .0226     .0270      .0165      .0166 
S.e. of a diff.   .0123    .0204     .0293      .0233      .0148 


Treatments are (each injected twice daily during period of observation): 
D: saline (control) 
B: moderate quinidine 
C: moderate atropine 
A: heavy atropine 
X = log₁₀ x: x = live body weight at time of operation (gm.) 
Y = log₁₀ y: y = body weight at time of killing (gm.) 
V = log₁₀ v: v = intact muscle weight (gm.) 
U = log₁₀ u: u = denervated muscle weight (gm.) 
Primes (′) indicate values adjusted for X. 


(V − Y)′_{.0} estimates log(v₀/y₀) where v₀ and y₀ are muscle and 
body weights of an average rat before treatment. As should be expected, 
if the model adequately fits the data, it is practically the same as the 
logarithm of the mean ratio for the control treatments, D, at all times, 
namely, −2.1870. 

Drug effects are summarized by Z′_{iw}. However an indication of 
the rate at which a variate was growing or shrinking is more informative. 
Therefore Table 4 presents B′_i(Z) = (Z′_{iw} − Z′_{.0}). Z′_{iw}, if wanted, can 
be obtained by adding Z′_{.0} as listed at the top of each column; except 
that, for reasons stated below, U′_{iw} = B′_i(U) + V′_{.0}, and (V − U)′_{.0} 
is assumed to be zero. 

Owing to variation in the magnitude of adjustment for initial body 
weights each estimate has a different error variance. But, for any one 
character these differences are not worth bothering about, having 
regard to the circumstance that s²_Z is itself only an estimate subject to 
error and an average for variances which may be not quite homogeneous 
for all treatments. We therefore evaluate only error variances averaged 
for all four drugs. Following the rules indicated above, for Z = Y or 
V, we write the adjusted estimate as 

$$B'_i(Z) = (Z_{iw} - Z_{.0}) - b(X_{iw} - X_{.0})$$

whence the estimate of average error variance is 

$$\mathrm{ave\ var}[B'_i(Z)] = \left[\frac{1}{4}\cdot\frac{14}{36} + \frac{1}{16}\cdot\frac{21}{9} + \frac{\sum_i (X_{iw} - X_{.0})^2}{4\,\mathrm{dev}^2 X}\right] s_Z^2 = .24682\, s_Z^2 . \tag{3}$$

The average initial weight should be the same for both intact and 
denervated muscles and its logarithm is more accurately estimated by 
V_{.0} than by U_{.0} since s²_V is less than half of s²_U. Furthermore nothing 
is gained by averaging the two estimates, their true weights being 
unknown. Owing to the close correlation of paired muscles, the unweighted 
mean has error variance greater than that of V_{.0}, and weights 
based on estimated variances would introduce more troubles than 
would be compensated by a trivial potential gain of precision. We therefore 
use V_{.0} as the preferred estimate of initial log weight for either 
type of muscle. The estimated regression of denervated muscles on 
time is then 

$$B'_i(U) = U_{iw} - V_{.0} - b_U(X_{iw} - X_{..}) + b_V(X_{.0} - X_{..})$$

where we now have to distinguish two regression coefficients on X 
which are correlated. The average error variance is then estimated by 

$$\mathrm{ave\ var}[B'_i(U)] = \frac{1}{4}\cdot\frac{14}{36}\, s_U^2 + \frac{1}{16}\cdot\frac{21}{9}\, s_V^2 + \frac{s_U^2\, \mathrm{ave}(X_{iw} - X_{..})^2 + s_V^2 (X_{.0} - X_{..})^2 - 2 s_{UV} (X_{.w} - X_{..})(X_{.0} - X_{..})}{\mathrm{dev}^2 X} = .0007302,$$

where s_{UV} is the error covariance of U and V after adjusting for X. 
It can be derived as 

$$s_{UV} = \frac{1}{35}\left\{[UV] - \frac{[UX][VX]}{[XX]}\right\}$$

where [ ] indicates a sum of squares or of products in the error row of 
Table 2; or from Table 3 as 

$$\tfrac{1}{2}(s_U^2 + s_V^2 - s_{U-V}^2) = 1812.0 \times 10^{-6}. \tag{4}$$

(Using U_{.0} and (3) the variance would be .001052 instead of .000730). 
Since theoretically (V − U)_{.0} = 0, the regression on time for 
logarithm of the ratio of the weights of intact and denervated muscle 
takes the simple form 

$$B'_i(Z) = Z'_{iw}$$

for the special case Z = (V − U). 
For any character 

$$Z'_{iw} = Z_{iw} - b(X_{iw} - X_{..})$$

and thence 

$$\mathrm{ave\ var}(Z'_{iw}) = \left[\frac{1}{4}\cdot\frac{14}{36} + \frac{\sum_i (X_{iw} - X_{..})^2}{4\,\mathrm{dev}^2 X}\right] s_Z^2 = .09985\, s_Z^2 . \tag{5}$$

The error variances of B′_i are only of interest for comparisons with 
other experiments, or to set confidence intervals for the rates of change. 
Of more immediate interest are the standard errors for comparisons 
between drugs, and we have the uncommon situation that such contrasts 
have lower standard errors than the individual regressions for 
the reason noted under equation (2) [except for (V − U) and for U 
for which the intercept at zero time is either known or estimated from 
less variable observations]. A difference between regressions for two 
drugs is the same as comparison of the estimates of size as measured by 
Z_{iw}, namely 

$$(B'_i - B'_j) = Z'_{iw} - Z'_{jw} = Z_{iw} - Z_{jw} - b(X_{iw} - X_{jw}).$$

To get the average variance we average the contribution for adjustments 
over all possible pairs of X_{iw}, X_{jw}, as suggested by Finney [1946]. The 
estimated variance of a difference is thus 

$$\left[\frac{1}{2}\cdot\frac{14}{36} + \frac{\mathrm{ave}(X_{iw} - X_{jw})^2}{\mathrm{dev}^2 X}\right] s_Z^2 = \frac{7}{36}\left[1 + \frac{\mathrm{M.sq.}(X_{iw})}{\mathrm{dev}^2 X}\right] s_Z^2 \tag{5a}$$

where M. sq.(X_{iw}) is .015436/3 from Table 2. 

If we had minimized average variance of adjusted means (Z′_{iw}), 
by adjusting to X_{.w} instead of to X_{..}, the corresponding substitution 
in (5) would make that formula equivalent to 

$$\frac{7}{72}\left[1 + \frac{\mathrm{S.sq.}(X_{iw})}{4\,\mathrm{dev}^2 X}\right] s_Z^2 . \tag{5b}$$

The individual adjusted estimates are slightly correlated so that the 
average variance of a difference is slightly more than double the average 
variance of a single treatment mean: whereas in the last term of (5b) 
S.sq.(X_{iw}) is divided by the number of treatments, in (5a) it is divided 
by its number of degrees of freedom. The difference is trivial. Indeed, 
for practical purposes in this experiment, where uncertainty of adjustments 
is responsible only for between 2 and 4 per cent of variances 
of estimates, it would have sufficed, as suggested by (5a), merely to 
have increased by 3 per cent all variances estimated as if adjustments 
were free of error. One must however be alert for exceptions and the 
purpose here has been to illustrate basic principles as modified by 
particular circumstances. 

The contrasts (V − U) and (V − Y) are of course easily derived 
from the differences among means of the respective characters; but 
their variances depend on correlations between the characters. Instead 
of worrying with co-variances we can carry through computations as 
above with Z = (V — U), etc., as a single character; hence the reason 
for including squares of these differences in Tables 2 and 3. To save 
space, their sums of products with X are not shown since they can be 
derived from others. 

The comparison (V — Y) = log (v/y) is incidental to the purpose of 
the experiment, but to consider how the drugs affect muscle as compared 
to body size may be of some interest. As we should expect, the ratio 
remains constant (within experimental error) for all times on “drug’’ 
D (control). Drugs shrink muscle proportionately more than body 
weight, as is again reasonable since presumably the weight of skeleton 
must remain unaffected. 

The individual (adjusted) treatment means and the time regressions 
are shown in Figure 1. Some supplementary data for 16 and 20 days 
after operation, later obtained from another paper (Solandt, et al. 
[1943]), are also shown in the figure. The graph suggests that the same 
linear model could be satisfactorily fitted for the extended period. 
Indeed, considering the well known risks of extrapolation, the regressions 
fitted to the earlier data alone continue to do remarkably well. The 
only serious discrepancies are the muscle weights for atropine at 16 days 
(A4) which show deviations .177 + .034 for V and .185 + .048 for U. 
But these observations were for the only 2 rats out of 25 on the heavy 
dose (additional to the 12 killed at days 4 to 12) who were able to 
survive so long. Hence a possible interpretation is that they may have 
been selected for resistance to the drug. Excluding U and V for these 
two rats, the mean square deviations from the extrapolated regressions, 
weighted by numbers of observations, were 


Character | No. of Deviations | Mean Square (× 10^6) | Ratio to s²
Y | … | 1782 | 2.38
V | … | 5582 | 2.69
U | … | 1712 | .40


*Including A4.

**Excluding A4.

***Excluding A4 and also C4 for which published data are evidently a misprint copy of the o 
observations. 
The mean square ratios for Y and V would be just about the 5 per cent 
significance level if they were honest F values. However, after taking 
into account error of the extrapolated regression estimates and correla- 
tions (both ignored in these estimates) the discrepancies would not be 
significant. Exact testing would require re-fitting the model to the 
entire data. 


COMPARISON OF EFFECTS ON INTACT AND DENERVATED MUSCLES 


Comparison of V and U was a prime purpose of the experiment. 
Before doing the experiment its authors seem to have hoped that one or 
more of the drugs might delay muscle atrophy. In fact they cause 
denervated muscle to shrink still faster than with no drug, but they 
have this effect also on intact muscle and the question has been asked 
whether or not their shrinking effect on the denervated muscle is as 
great as on intact muscle. This raises the supplementary question of 
what may be meant by “as great” since there must be a limit to the 
potential rate of shrinkage. Null drug × denervation interaction 
here implies that comparisons between rates of shrinkage on intact 
and denervated muscles under different drugs do not vary by more 
than could be ascribed to the lower potentiality for further shrinkage 
in atrophying muscle. If we had a clear cut hypothesis as to how drug 
and denervation effects should superimpose if the drug has no effect 
on atrophy, testing would be straightforward. Since we have no such 
hypothesis we can only seek to map out from the observations what 
in fact happens. Interaction as defined would be indicated by a significant 
reversal of effects; that is, if for two drugs, i and j, the contrasts 
$U_i - U_j$ and $V_i - V_j$ were both significant and with opposite 
signs. Less stringently we may postulate that in absence of interaction 
the difference $V_i - U_i$, or equivalently $U_i$, should change smoothly 
and monotonically with intensity of drug effect. The latter may be 
measured by $V_i$, $(V_i + U_i)$, or by body weight ($Y_i$). There being 
no reason to suppose that the relation should be linear, a curve may 
have to be considered before significant deviations can be claimed to 
indicate interaction. 

Whether an a priori hypothesis may be available for test, or the 
‘null’ interaction relation of U to V has to be inferred from the data, 
U and V must both be regarded as joint experimental variates. Neither 
formulation can treat V as a concomitant variable as in ordinary analysis 
of covariance. One might be tempted to test for a monotonic relation 
by studying deviations of U or of (V — U) from its regression on V or 
(V + U). But the following circumstances indicate that such procedure 
could be misleading. For simplicity of discussion suppose that the 
relation may be linear. Firstly, the structural relation specified by 
hypothesis should conform simultaneously to any of the forms 

$$U = \alpha + \beta V \tag{6}$$
$$U - V = \alpha - (1 - \beta)V \tag{7}$$
$$U = [\alpha + \beta(U + V)]/(1 + \beta)$$
$$U - V = [2\alpha - (1 - \beta)(U + V)]/(1 + \beta).$$

Using regressions, only (6) and (7) would be the same and they would 
not conform with the others. Secondly, deviations from regressions 
(6) and (7) would be identical, whereas the error variances to which 
one would be led to compare them might be very different, as they are 
in the present example. The tests for deviations to which each would 
lead cannot both be correct; the implication is that both would be 
wrong. 
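
The two objections just made can be checked numerically. The following is a minimal sketch only: the four pairs of treatment means are hypothetical values chosen for illustration (they are not the data of this paper), and the helper names `ols` and `residuals` are likewise assumptions. It shows that least-squares regressions of U on V and of (U − V) on V give identical deviations, whereas the regression of U on (U + V) does not have the slope β/(1 + β) that the structural identity would require.

```python
# Hypothetical treatment means, chosen only to illustrate the algebra.
V = [1.0, 2.0, 3.5, 5.0]
U = [0.9, 1.4, 2.9, 3.6]


def ols(x, y):
    """Least-squares intercept and slope for the regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Deviations of y from its least-squares regression on x."""
    a, b = ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


diff = [u - v for u, v in zip(U, V)]    # U - V
total = [u + v for u, v in zip(U, V)]   # U + V

a6, b6 = ols(V, U)        # regression counterpart of form (6)
a7, b7 = ols(V, diff)     # regression counterpart of form (7)
aS, bS = ols(total, U)    # regression of U on (U + V)

# Slope that the third form would need if regression respected the identity.
implied_bS = b6 / (1 + b6)

res6 = residuals(V, U)
res7 = residuals(V, diff)

print("slope (6):", b6, " slope (7):", b7)
print("slope of U on U+V:", bS, " implied by (6):", implied_bS)
```

The slopes for (6) and (7) differ by exactly one and the deviations coincide, so the two regressions carry the same information, while the third form disagrees unless the points are exactly collinear.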

No satisfactory general method for fitting a structural curve seems 
available. When groups can be demarcated independently of possible 
random errors the Nair-Shrivastava [1942] method of averages might 
serve. In the present example a curve will be no better than a straight 
line. Two or three forms of curve were tried but only increased the 
mean square deviation. Since reasonably reliable estimates of the 
error variances and covariance are available the Kummell line may be 
fitted. 

With correlated errors, the simplest procedure seems to be to make 
a linear transformation to variates with uncorrelated errors and equal 
error variance. The estimates of error variances after adjusting for 
X are, from Table 3 and equation (4), 


$$s_V^2 = 2071.4, \qquad s_U^2 = 4260.7, \qquad s_{UV} = 1812.0.$$


The corresponding values for $Z'_i$ are proportional to these, the average 
values being given by multiplying by .099853 (Equation 5). Choosing 
V as one variate the other is 

$$W = \pm\frac{\sigma_V^2 U - \sigma_{UV} V}{(\sigma_V^2\sigma_U^2 - \sigma_{UV}^2)^{1/2}} = .76968V - .87987U$$

when the above estimates are substituted for the σ's, and the negative 
sign is chosen. For brevity let deviations from the means of $V'_i$ and 
$W'_i$ (or equivalently of the corresponding $B'$, Table 4) be denoted 

$$y_{0i} = V'_i - \bar{V}', \qquad y_{1i} = W'_i - \bar{W}',$$

124 BIOMETRICS, MARCH 1958 


which we assume to be uncorrelated with common variance 
$.09985\,s_V^2 = 206.8$ for logarithms multiplied by $10^3$. The values of y are: 


Treatment | y_0 | y_1
D | 132.58 | 22.24
B | 52.16 | 26.81
C | −27.19 | −38.00
A | −157.55 | −11.05
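
The transformation from (V, U) to (V, W) can be verified from the three error estimates alone. The sketch below is illustrative; the variable names are assumptions, and only the quoted values of the error variances and covariance are taken from the text. It recomputes the coefficients of W and confirms that W has the same error variance as V and zero error covariance with it.

```python
import math

# Error variances and covariance quoted in the text.
s_vv, s_uu, s_uv = 2071.4, 4260.7, 1812.0

# W = -(s_vv * U - s_uv * V) / sqrt(s_vv * s_uu - s_uv**2), written as W = a*V - b*U.
root = math.sqrt(s_vv * s_uu - s_uv ** 2)
a = s_uv / root
b = s_vv / root

# Error variance of W and error covariance of V with W.
var_w = a * a * s_vv - 2 * a * b * s_uv + b * b * s_uu
cov_vw = a * s_vv - b * s_uv

print("W = %.5f V - %.5f U" % (a, b))
print("var(W) =", var_w, " cov(V, W) =", cov_vw)
```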


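The angle θ of the structural line with the $y_0$ axis is now estimated from 

$$\tan 2\theta = \frac{2\sum_i y_{0i}y_{1i}}{\sum_i y_{0i}^2 - \sum_i y_{1i}^2} = .33065; \qquad \theta = 9^\circ 8.4'.$$


The estimated line is 

$$y_1 = .1609\,y_0$$

or 

$$U' = .1486 + .6919V'. \tag{8}$$


Let deviations from the line be denoted 

$$z_{1i} = y_{1i}\cos\theta - y_{0i}\sin\theta$$

and deviations from the origin along the line be 

$$z_{0i} = y_{1i}\sin\theta + y_{0i}\cos\theta = \xi_i + \epsilon$$

where ε is a random variable with variance σ², estimated above to be 
206.8. On the null hypothesis that deviations from the line are only 
random error the expectation of $\sum_i z_{1i}^2$ is (Smith [1956]) 

$$\sigma^2(n - 2)(1 - \omega - O(\omega^2))$$

where here n = 4 and $\omega = \sigma^2/\sum_i \xi_i^2$. Estimating $\sum_i \xi_i^2 = \sum_i z_{0i}^2 - n\sigma^2 = 47176.1$, 
ω = .0048, the estimated variance of deviations from the 
line is 

$$\frac{\sum_i z_{1i}^2}{(n - 2)(1 - \omega)} = 820.3.$$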
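
The fitting of the structural line can be retraced from the tabulated deviations. The following sketch is offered under stated assumptions: the names `z0`, `z1`, `sum_xi2` and `omega` are illustrative labels for the quantities described above, the sum of squares along the line is corrected by nσ² as a simple moment estimate, and intermediate figures need not agree with the printed ones to the last digit because the inputs are rounded.

```python
import math

# Deviations y_0 and y_1 for treatments D, B, C, A, as tabulated in the text.
y0 = [132.58, 52.16, -27.19, -157.55]
y1 = [22.24, 26.81, -38.00, -11.05]
sigma2 = 206.8             # common error variance quoted in the text
a, b = 0.76968, 0.87987    # W = a*V - b*U, coefficients quoted in the text

s00 = sum(v * v for v in y0)
s11 = sum(w * w for w in y1)
s01 = sum(v * w for v, w in zip(y0, y1))

# Angle of the line with the y_0 axis, and the slope in both coordinate systems.
tan_2theta = 2 * s01 / (s00 - s11)
theta = 0.5 * math.atan(tan_2theta)
slope_y = math.tan(theta)         # y_1 = slope_y * y_0
slope_uv = (a - slope_y) / b      # slope of U on V after back-transformation

# Deviations perpendicular to the line and distances along it.
z1 = [w * math.cos(theta) - v * math.sin(theta) for v, w in zip(y0, y1)]
z0 = [w * math.sin(theta) + v * math.cos(theta) for v, w in zip(y0, y1)]

n = len(y0)
sum_xi2 = sum(z * z for z in z0) - n * sigma2   # moment estimate (assumption)
omega = sigma2 / sum_xi2
var_dev = sum(z * z for z in z1) / ((n - 2) * (1 - omega))
ratio = var_dev / sigma2

print("tan 2theta =", tan_2theta, " theta (degrees) =", math.degrees(theta))
print("slope of y_1 on y_0 =", slope_y, " slope of U on V =", slope_uv)
print("variance of deviations =", var_dev, " ratio to error =", ratio)
```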


The variance ratio for deviations from the line divided by internal 
error variance is therefore 820.3/206.8 = 3.97. Assuming that the 
sampling distribution of the ratio may be approximately that of F 
with 2 and 35 degrees of freedom, significance slightly beyond the 5 per 
cent point is suggested. Since there seems to have been virtually no a 
priori knowledge to indicate what sort of result to expect from this 
experiment, the practical conclusion at this level of significance for a 
test statistic whose exact distribution is not known might be to reserve 
judgment. As already noted, after these computations were done some 
supplementary data were obtained. It can be seen from Figure 1 that 
if all data were used the estimates of U for drug B and V for drug C 
would each be increased. Both alterations would bring the points 
closer to the fitted structural line and deviations re-computed for all 
the data might not be significant. So far as present evidence goes 
the relation (8) seems to describe very well the relation between U and 
V after 9 days on variable doses of atropine or quinidine. Differential 
effect of drugs on denervated versus intact muscles is not yet demon- 
strated. 

The foregoing analysis is defective in using estimates of error vari- 
ances and covariance for scaling as if they were exactly known. However 
with over 30 d.f. for error such defect may not be serious and the pro- 
cedure seems substantially better than some alternatives which might 
be suggested. 

If a regression analysis were used the mean square deviation of 
$U'_i$ or of $(U - V)'_i$ from their regressions on $V'_i$ is 1081.3, leading to 
variance ratios relative to the respective error variances of 2.54 or 4.00. 
The latter ratio happens to be near that formerly obtained, but this 
depends on coincidental circumstances which could not be forecast. 
The reason is that the correlation of errors in U and V is such that the 
structural analysis minimizes a sum of squares of deviations which are 
not far from vertical on the scales of U and V and hence are similar to 
those minimized by the regression analysis, that the regression of 
$(V - U)'_i$ on $V'_i$ is not far from zero, and that the random errors of 
these two variates are almost uncorrelated. An unusual feature is 
that the estimate of the slope of the structural line for U′ on V′, viz. 
.6919, is less than that of the regression which is .6983. This has 
happened because the random errors are highly correlated with even 
steeper regression, viz. .8748. 
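
The two variance ratios and the error regression quoted in this paragraph follow from figures already given. A short arithmetic check is sketched below; the variable names are illustrative, and the error variance of (U − V) is taken, as an assumption consistent with the usual rule for a difference, to be the sum of the two variances less twice the covariance.

```python
# Figures quoted in the text.
s_vv, s_uu, s_uv = 2071.4, 4260.7, 1812.0
k = 0.099853       # multiplier from Equation 5
ms_dev = 1081.3    # mean square deviation from the regressions on V'

err_u = k * s_uu                               # error variance for U
err_u_minus_v = k * (s_uu + s_vv - 2 * s_uv)   # error variance for U - V

ratio_u = ms_dev / err_u
ratio_diff = ms_dev / err_u_minus_v
error_slope = s_uv / s_vv    # regression of errors in U on errors in V

print(round(ratio_u, 2), round(ratio_diff, 2), round(error_slope, 4))
```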

An alternative approach to the structural analysis would be to 
note that the Kummell line corresponds to the first canonical variate 
and hence that deviations from it correspond to the second canonical 
variate. Presumably it would not be too difficult to modify a canonical 
analysis for covariance on a concomitant variable (cf. Cochran and 
Bliss [1948]) and to obtain an approximate χ² test for the second canonical 
variate. The χ² approximations for these multivariate tests, however, 
suppose the error variances to be exactly known (Rao [1952] 
Sec. 9d), whereas the variance ratio here derived allows that the estimate 
of error is also a random variable. The above analysis, furthermore, 
shows in elementary manner exactly what is being done, in contrast to 
the mystification which some workers find in canonical variate 
terminology. 

No matter what conclusions were reached from the given experiment, 
after discovery that drugs affected intact muscles, a further experiment 
would be needed to give a convincing answer to the main question if 
there seemed any chance that a drug might be beneficial. DeLury 
([1948] p. 167) notes that atrophy is irreversible whereas weight loss 
due to drugs is recoverable. It might happen, however, that drug 
shrinkage is not recoverable by denervated muscle; and since any 
reasonable definition of “beneficial” would seem to require that at 
some stage a drugged denervated muscle should be larger than an un- 
drugged one, conclusive evidence that a drug delays atrophy could 
only be given after withdrawing the drug and observing u after v has 
recovered. If u failed to recover, the practical conclusion could only 
be that the drug had intensified atrophy even if its immediate effect 
was in some sense less than its effect on intact muscle. 


SUMMARY 


An analysis of covariance with several characters, already discussed 
some years ago in Biometrics, is re-examined. The point is first made 
that to apply standard routine analysis of covariance reduction when 
an “independent” variable is affected by treatments may be equivalent 
to estimating an artifact with little or no meaning. 

The example studied has one concomitant variable, x = body size, 
representing variability of the experimental material before treatments 
begin, and several experimental variates. The treatments are combi- 
nations of time and drugs. A method of analysis is proposed which 
first studies regressions on time for each drug and picks out a convenient 
function of time observations in which is concentrated all available 
information about contrasts between drugs. Associated responses of 
the dependent variates to drugs are then studied by means of a structural 
relation, that is, one relative to which the random errors of both variates 
receive due consideration. 

The important distinction is made between a regression associated 
with random variation within treatments, and regressions or structural 
relations among treatment means. The former is that used in ordinary 
analysis of covariance to remove some of the extraneous random varia- 
tion. The latter derive from treatment effects. Variation between 
quantitative treatments (in this example, time) can be summarized by 
regression of the experimental variates on the independent variable 
defined by the treatments. Joint variation of two experimental variates 
between qualitative treatments must be described by a structural re- 
lation. In the example there arose the question whether such variation 
of two characters could be supposed to be functionally related, the 
contrary condition being that a given change in one variate may be 
associated with variable amounts of change in the other, depending 
on the quality of causative treatment. It is demonstrated that to 
evaluate this in terms of a regression analysis, instead of by a structural 
analysis, may be seriously misleading. 


QUERIES AND NOTES 


129 NOTE: ITERATIVE SOLUTIONS OF LIKELIHOOD EQUATIONS 


ALAN STUART 
Research Techniques Unit, London School of Economics, London, England 


1. H. W. Norton [1956] has pointed out that we cannot safely assume 
that a single iteration is sufficient when obtaining the value of a maxi- 
mum likelihood estimator from that of another consistent statistic, 
taken as a trial value. This is quite unexceptionable—unless we are 
fortunate enough to find that the result of the first iteration is to make 
no change at all in the trial value, we clearly need at least two iterations 
to be sure that the values obtained are converging. The purpose of 
this note is to point out that the rapidity of convergence of the suc- 
cessive approximations is hastened by a slight modification of the 
technique used by Norton. To demonstrate the point, we review the 
basis of the iterative method. 

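2. The likelihood equation for the estimation of θ, 

$$\frac{d \log L(x \mid \theta)}{d\theta} = 0, \tag{1}$$

is expanded in a Taylor series about the trial value θ = t. To the first 
order of approximation, this gives 

$$0 = \left[\frac{d \log L(x \mid \theta)}{d\theta}\right]_{\theta=\hat\theta} = \left[\frac{d \log L(x \mid \theta)}{d\theta}\right]_{\theta=t} + (\hat\theta - t)\left[\frac{d^2 \log L(x \mid \theta)}{d\theta^2}\right]_{\theta=\theta^*}, \tag{2}$$

where θ* is some value in the interval (t, $\hat\theta$), $\hat\theta$ being the exact value of 
the maximum likelihood estimator satisfying (1). 
Re-arranged, (2) gives 

$$\hat\theta = t - \left[\frac{d \log L(x \mid \theta)}{d\theta}\right]_{\theta=t} \bigg/ \left[\frac{d^2 \log L(x \mid \theta)}{d\theta^2}\right]_{\theta=\theta^*}. \tag{3}$$
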
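For large n, θ*, which is bracketed by two consistent estimators, 
converges in probability to the true value, say $\theta_0$. Thus the denominator 
of the second term on the right of (3) converges in probability to 

$$\left[\frac{d^2 \log L(x \mid \theta)}{d\theta^2}\right]_{\theta=\theta_0}$$

and this itself converges to 

$$E\left(\frac{d^2 \log L}{d\theta^2}\right) = -\frac{1}{\operatorname{var}\hat\theta}$$

asymptotically. 
Thus we may write approximately 

$$\hat\theta = t - \left[\frac{d \log L(x \mid \theta)}{d\theta}\right]_{\theta=t} \bigg/ \left[\frac{d^2 \log L(x \mid \theta)}{d\theta^2}\right]_{\theta=t} \tag{4}$$

corresponding to the first process of convergence mentioned above, 
or alternatively 

$$\hat\theta = t - \left[\frac{d \log L(x \mid \theta)}{d\theta}\right]_{\theta=t} \bigg/ \left[E\left(\frac{d^2 \log L(x \mid \theta)}{d\theta^2}\right)\right]_{\theta=t} \tag{5}$$

corresponding to the second. 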

3. It was in the form (5) that Fisher [1925] introduced the iterative 
method of solution, and it is in this form that it is given by well-known 
texts (such as those by C. R. Rao and by M. G. Kendall). Norton, 
however, uses the form (4). 

This makes no difference to the ultimate value obtained for $\hat\theta$, 
since the forms (4) and (5) are both valid expansions, but tends to 
make the iterative process longer. To illustrate this, consider the 
first of Norton’s iterations. We have t = 0.0570, and calculate, for the 
data concerned, (see Fisher [1950], cited by Norton): 


$$\left[\frac{d \log L}{d\theta}\right]_{\theta=0.0570} = \frac{1997}{2.0570} - \frac{1810}{0.9430} + \frac{32}{0.0570} = -387.1713.$$


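Now in this example 

$$\frac{d^2 \log L}{d\theta^2} = -\frac{a}{(2 + \theta)^2} - \frac{b + c}{(1 - \theta)^2} - \frac{d}{\theta^2} \tag{6}$$

and 

$$E\left(\frac{d^2 \log L}{d\theta^2}\right) = -\frac{1}{\operatorname{var}\hat\theta} = -\frac{n(1 + 2\theta)}{2\theta(1 - \theta)(2 + \theta)}. \tag{7}$$

Inserting the numerical values of the data, (6) gives 

$$\left[\frac{d^2 \log L}{d\theta^2}\right]_{\theta=0.0570} = -\frac{1997}{2.0570^2} - \frac{1810}{0.9430^2} - \frac{32}{0.0570^2} = -\frac{1}{.00008241985}, \tag{8}$$

while (7) gives 

$$\left[E\left(\frac{d^2 \log L}{d\theta^2}\right)\right]_{\theta=0.0570} = -\frac{3839 \times 1.114}{2 \times 0.057 \times 0.943 \times 2.057} = -\frac{1}{.00005170678}. \tag{9}$$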


Substituting, we now find that (4) becomes 

$$\hat\theta = 0.0570 - 387.1713 \times .00008241985 = 0.0570 - 0.0319 = 0.0251,$$

which is Norton’s value, apart from a disagreement in the last figure. 
On the other hand, (5) becomes 

$$\hat\theta = 0.0570 - 387.1713 \times .00005170678 = 0.0570 - 0.0200 = 0.0370.$$


This is very much closer to the accurate value (0.0357) than the estimate 
(0.0251) obtained from (4), the adjustment by (5) being very good, 
while that from (4) seriously overshoots the mark and leads to the 
multiplication of iterations. 

A second iteration by (5) gives the estimate $\hat\theta$ = 0.0352, very close 
indeed to the accurate value. Use of (4) requires three iterations to 
reach a value (0.0355) as close as this, the second iteration giving 0.0330. 

4. One would suppose that (5) will generally give better adjustment 
from an inefficient estimator, since it introduces an averaging process 
which does not appear in (4). However that may be, it certainly pro- 
duces better results in the case which Norton examined. 
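
The comparison of forms (4) and (5) can be repeated mechanically. The sketch below is illustrative only: the function names are assumptions, the counts are those appearing in the worked example (with d obtained as n − a − (b + c) = 32), and figures printed after the first step need not match the hand computations quoted above to the last digit.

```python
# Counts from the worked example: a = 1997, b + c = 1810, n = 3839.
A, BC, N = 1997, 1810, 3839
D = N - A - BC   # remaining class, 32


def score(t):
    """First derivative of log L at theta = t."""
    return A / (2 + t) - BC / (1 - t) + D / t


def observed_second_derivative(t):
    """Equation (6): second derivative of log L."""
    return -A / (2 + t) ** 2 - BC / (1 - t) ** 2 - D / t ** 2


def expected_second_derivative(t):
    """Equation (7): expectation of the second derivative."""
    return -N * (1 + 2 * t) / (2 * t * (1 - t) * (2 + t))


def iterate(t, second_derivative, steps):
    """Repeat the adjustment t - score / second derivative."""
    path = [t]
    for _ in range(steps):
        t = t - score(t) / second_derivative(t)
        path.append(t)
    return path


form4 = iterate(0.0570, observed_second_derivative, 6)   # form (4)
form5 = iterate(0.0570, expected_second_derivative, 6)   # form (5)

for k, (p, q) in enumerate(zip(form4, form5)):
    print(k, round(p, 4), round(q, 4))
```

Both sequences settle on the same value, but the first adjustment by form (5) lands much nearer to it than the first adjustment by form (4).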
ABSTRACTS 


Papers presented at the tenth annual meeting of the Biometric Society (E.N.A.R.) in 
joint session with the American Statistical Association and the Institute of Mathematical 
Statistics in Atlantic City, September 10-13, 1957. In order to avoid duplication some 
of the abstracts from these sessions will be published in the Journal of the American 
Statistical Association. The titles for those abstracts are listed at the end of this section. 


CHESTER ALEXANDER (Westminster College, Fulton, Missouri). 
Longevity as a Family Trait. 


Considerable interest has been aroused during recent years over the 
fact that human longevity has been increasing at the rate of about four 
months every year. Such interest has crystallized in the expression of 
many opinions and in occasional articles for learned magazines and 
journals. There are three major areas of influence within which, or in 
combination, causes may be sought. 

One of these areas is that of biological inheritance, or that longevity 
is a heritable trait. If that be the explanation then there should be 
certain similarities between the length of lives of the kinship generations. 
This could be tested by gathering enough data to be able to run correla- 
tions, and compare offspring with their ancestors. 

In this study of longevity as a family trait data have been assembled 
from numerous genealogical sources in libraries, genealogical societies, 
and in public records. At the time of this writing calculations covering 
over 20,000 persons have been processed, and several thousand others 
are being dealt with at the present time. 

The whole problem has been divided into sections such as father-son 
pairs, father-daughter, mother-son, mother-daughter, grandmother- 
grandson, grandmother-granddaughter, sibling-pairs, and _ several 
unrelated test groups. Each of these groups of which there are 26 
different combinations has been divided into samples, and samples 
contain from 500 to over 1000 cases. In treating the problem in this 
manner sampling variations will be discovered. There are as many as 
ten samples for each problem, and there are 25 major problems in the 
present study. 

In addition to these one or two generation comparisons data are 
available which trace family kinship over as many as ten generations, 
and a few which run back for at least 1000 years. All of the tabulation 
and analysis have been done with the use of calculating machines, and 
by people who have had practical training in statistical studies. 

Other aspects of the study include mean ages of husbands and wives, 
and these compared with their children, longevity of children according 
to the order of birth, number of children compared with the longevity 
of their mothers, length of actual child-bearing period compared with 
mother longevity, and number of children per family compared with 
their longevity. 

The father-son correlations run as follows: .07, .06, .00, .02, and so on. 
Fathers and daughters have similar r’s: .009, .08, etc.; mothers and 
sons, .10, .03, .03, .10, .11, and so on. Mothers-daughters do not rise 
above .08. Grandfathers-grandsons relate .00, .00, .12, ...; grand- 
fathers-granddaughters: .09, .09, .07, .05, .12, ... (several thousand 
cases in each sampling). Means of parent ages compared with their 
children are .08, .06, −.01, .19, .06, .09, etc. 

To date the factors derived by statistical calculations do not support 
the hypothesis that longevity is a distinct genetic trait. It still may be 
found to be so, but the thousands of pairs analyzed fail to support the 
belief. The study is not complete as more thousands of cases are being 
woven into the whole study, and non-related groups are being used for 
comparison. 


466 PETER ARMITAGE (London School of Hygiene, London, 
England, and National Cancer Institute, Bethesda, Maryland). 
Sequential Procedures for Medical Trials. 


The place of sequential methods in clinical trials is briefly reviewed. 
In the past, consideration has been given particularly to the situation 
in which the measurement of response for any patient is made fairly 
soon after treatment. It may be ethically desirable to perform some 
sort of sequential analysis in trials involving long follow-up periods, 
e.g. where the criterion is survival time after treatment. Any sequential 
analysis would have to be supplemented by a later analysis involving 
complete follow-up of all patients. Two methods of sequential analysis, 
non-efficient but suitable as a basis for stopping rules, are considered. 
The more useful appears to be a paired comparison analogous to the 
sign test. 


467 EARL L. ATWOOD (U.S. Fish and Wildlife Service, Washington, 
D.C.). A Procedure for Removing the Effect of Response Bias 
Errors from Waterfowl Hunter Questionnaire Responses. 


Response bias errors are studied by comparing questionnaire 
responses from waterfowl hunters using four large public hunting areas 
with actual hunting data from these areas during two hunting seasons. 
To the extent that the data permit, the sources of the error in the re- 
sponses were studied and the contribution of each type to the total error 
was measured. Response bias errors, including both prestige and 
memory bias, were found to be very large as compared to non-response 
and sampling errors. Good fits were obtained with the seasonal kill 
distribution of the actual hunting data and the negative binomial 
distribution and a good fit was obtained with the distribution of total 
season hunting activity and the semi-logarithmic curve. A comparison 
of the actual seasonal distributions with the questionnaire response 
distributions revealed that the prestige and memory bias errors are 
both positive. The comparisons also revealed the tendency for memory 
bias errors to occur at digit frequencies divisible by five and for prestige 
bias errors to occur at frequencies which are multiples of the legal daily 
bag limit. A graphical adjustment of the response distributions was 
carried out by developing a smooth curve from those frequency classes 
not included in the predictable biased frequency classes referred to 
above. Group averages were used in constructing the curve, as sug- 
gested by Ezekiel [1950]. The efficiency of the technique described for 
reducing response bias errors in hunter questionnaire responses on 
seasonal waterfowl kill is high in large samples. The graphical method 
is not as efficient in removing response bias errors in hunter questionnaire 
responses on seasonal hunting activity where an average of 60 percent 
was removed. 


468 C. L. CHIANG (University of California, Berkeley, California). 
An Application of Stochastic Processes to Life Tables and the 
Standard Error of Age Adjusted Rates. 


In this paper the conventional cohort and current life tables are 
approached from the viewpoint of stochastic processes in which all the 
biometric functions are treated as random variables. The probability 
generating function of the joint distribution of survivors in the cohort 
life table is derived. Variances and covariances of the random variables 
are given. The variances of life expectancy in the two life tables are 
not identical and they are equal only in a particular case. 

Methods of estimating life expectancy and the standard error of the 
estimates are presented for follow-up studies where mortality information 
is incomplete. For example, $l_0$ patients are followed for y years with $l_y$ 
of them surviving the period. If the force of mortality is assumed to be 
constant after y — 1 years of follow-up, the life expectancy may be 
estimated as 


z=1 y-1 
Application of the methods is made to a follow-up study on patients 
having angina pectoris in studying the difference in life expectancy 
between male and female patients. 
Discussion is also made on the standard error of commonly used age 
adjusted rates. 


469 W. T. FEDERER (Cornell University, Ithaca, New York). 
Augmented Designs. 


An augmented design is a standard experimental design with addi- 
tional treatments included in some of the blocks, incomplete blocks, 
rows and columns, or other stratification groupings. Two kinds of 
treatments, or entries, appear in the design; the first kind, standards or 
standard treatments, are those treatments appearing in the standard 
design, and the second kind, the new or augmented treatments, are 
those treatments not occurring in the standard design but occurring 
in the augmented design. The amount of replication is usually less on 
the new treatments than on the standards. The construction of aug- 
mented designs presents little difficulty. The analysis for this new class 
of experimental designs has been developed in detail for augmented 
designs with one-way elimination of heterogeneity and for augmented 
designs with two-way elimination of heterogeneity, and the analysis 
has been indicated for augmented designs with three-way and higher-way 
elimination of heterogeneity. Specific examples including the augmented 
randomized complete block design, the augmented balanced incomplete 
block design, the augmented latin square design, the augmented in- 
complete latin square design, and the augmented magic latin square 
design have been considered in the paper. Numerical examples are 
included. 


470 JOHN L. FULLER (R. B. Jackson Memorial Laboratory, Bar 
Harbor, Maine). Quantitative Genetics of Behavioral Attributes. 


The heritability of behavioral attributes has often been demonstrated 
by selection experiments and comparisons between pure lines. Usually 
it is concluded that differences in these characters depend upon numerous 
genes. However, many authors have attempted to explain their 
results on the basis of a few genes. This paper describes attempts to 
fit data on activity, learning, and emotionality in animals to various 
genetic models. Problems of scaling are often encountered, since 
behavior does not have natural units of measurement such as grams or 
centimeters. Interaction between genes also leads to complexity. 
Hybrids may fall outside the parental range. In spite of these difficulties 
it is concluded that quantitative genetics models can be applied to 
metrical behavioral characters just as they can be used to analyze the 
inheritance of structure. 


471 SAMUEL W. GREENHOUSE (National Institute of Mental 
Health, Bethesda, Maryland). Some Statistical and Methodological 
Aspects in the Clinical Evaluation of the Tranquilizers in 
Mental Illness. 


A review is made of 25 controlled studies evaluating the use of 
tranquilizers in mental illness. Two drugs were tried almost exclusively 
in these studies: chlorpromazine and reserpine. 

In eight studies, chlorpromazine produced an average “improve- 
ment” rate of 56%. This same drug also produced an average improve- 
ment rate of 25% in control groups given a placebo, illustrating once 
again the relatively large response due to placebo. The corresponding 
average rates of improvement shown by reserpine in eleven studies were 
54% and 18% respectively. Five of the eight studies using chlorpro- 
mazine showed a response in the treated group significantly higher than 
in the corresponding placebo group and six of the eleven studies utilizing 
reserpine showed a significantly higher favorable response in the drug 
group over the placebo group. In the chlorpromazine studies, the 
hypothesis of homogeneity is tenable among the eight placebo control 
groups. In the reserpine studies homogeneity is rejected for both the 
control and treated groups. 

It is emphasized that improvement does not mean remission of 
disease but favorable changes in behavior and symptoms secondary to 
the disease. Methodologically, difficulties arise in maintaining the 
double blind procedure and in isolating various effects of the experiment 
itself on environment and hospital staff. 


472 W. D. HANSON (Biometrical Services, ARS, United States 
Department of Agriculture, Beltsville, Maryland). The Theoretical 
Distribution of Lengths of Undisturbed Chromosome 
Segments in $F_1$ Gametes. 


A problem recognized by plant breeders is the restriction in recom- 
bination resulting from linkage, the tendency of genes located on the 
same chromosome to be transmitted as a unit or block in inheritance. 
The development of the distribution characterizing the gene blocks in 
the gametes of an $F_1$ individual was given. Segment lengths were 
considered in units of centimorgan/100; thus, an x = .01 would indicate 
a chromosome length such that the probability is .01 that a gamete 
carrying a genetic recombination in this segment would occur. 

The density distribution for a segment, $x_1 - x_2$, depends on two 
independent joint probabilities, $dP = K P(x) f(x_1, x_2)\, dx_1\, dx_2$, where 
P(x) is the probability that a crossover will not occur in the segment, 
$(1 - x_1 + x_2)$, and $f(x_1, x_2)\, dx_1\, dx_2$ is the joint density distribution for 
$x_1, x_2$. With no interference $x_1$ and $x_2$ are dependent in a statistical 
sense and are characterized by rectangular distributions. Since interference 
exists and is a function of $(x_1 - x_2)$, $f(x_1, x_2)$ was taken to be 
$(x_1 - x_2)^a$. The comparison of limiting cases suggested the use of 
a = 3} for the “average” distribution. The density distribution for 
segments ($S_i$) within the set defined by the intersection, $\bigcap_i S_i = x$, 
where x is the locus of A, a factor, was developed. The expected lengths 
of gene blocks were calculated for different cases and the results were 
interpreted. 


473 EUGENE K. HARRIS (Robert A. Taft Sanitary Engineering 
Center, Public Health Service, Cincinnati, Ohio). On the Probability 
of Survival of Bacteria in Sea Water. 


These experiments consist of sets of replicate survival runs of 
Salmonella and coliform organisms added to sea water at varying water 
temperatures and salinities. The observations are estimates of bacterial 
densities 6, at fixed sampling times. ‘Most Probable Number” (MPN) 
estimates, based on the number of sterile tubes in a dilution series, 
are used. 

To represent observed variability in replicate MPNs at each sampling 
time, a Gamma distribution of 6, is postulated and compounded with 
the binomial distribution of sterile tubes. Using a general formula for 
the factorial moments of such compound binomials, simple moment 
estimates of the parameters are obtained. These estimates are shown 
to be at least 95 per cent as efficient as maximum likelihood estimates 
over a wide range of values of the parameters. 

The fitted compound distributions show good agreement with 
observed distributions of sterile tubes. Possible application of these 
results to problems in environmental sanitation are discussed. 


474  H. LEON HARTER AND MARY D. LUM (Wright Air Development
Center, Wright-Patterson Air Force Base, Ohio). A Note on Tukey’s
One Degree of Freedom for Non-Additivity.


John W. Tukey has proposed (Biometrics 5, 232-242, 1949) a test for 
[removable] non-additivity in a two-factor experiment. This test can 
easily be extended to experiments involving three or more factors. A 
sum of squares (with a single degree of freedom) for non-additivity is 
separated from the sum of squares for interaction, and tested against 
the residual. The authors are not aware that any adequate interpreta- 
tion of this sum of squares has ever been given. The interpretation is 
as follows: The sum of squares for non-additivity is the sum of squares 
for linear regression of interaction on the product of the main effects. 
An equivalent interpretation: The ratio of the sum of squares for non- 
additivity to the sum of squares for interaction is the coefficient of 
determination (the square of the coefficient of correlation) between the 
interaction and the product of the main effects. The sign of the corre- 
lation coefficient indicates the type of transformation required to 
minimize non-additivity. Let the required transformation be x^p; 
then a positive correlation coefficient indicates p < 1, while a negative 
correlation coefficient indicates p > 1. Several artificial examples are 
given. 
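
The interpretation stated above can be checked numerically. The sketch below assumes a two-way table with one observation per cell; the function name and return layout are hypothetical. It computes the single-degree-of-freedom sum of squares as the regression of the interaction residuals on the product of the estimated main effects, and reports its ratio to the interaction sum of squares together with the sign of the correlation.

```python
def tukey_nonadditivity(y):
    """Return (ss_nonadd, ss_interaction, r_squared, sign) for a two-way table.

    y is a list of rows, one observation per cell.
    """
    rows, cols = len(y), len(y[0])
    grand = sum(sum(r) for r in y) / (rows * cols)
    # Estimated main effects of rows and columns.
    a = [sum(r) / cols - grand for r in y]
    b = [sum(y[i][j] for i in range(rows)) / rows - grand for j in range(cols)]
    # Interaction residuals after removing the additive fit.
    resid = [[y[i][j] - grand - a[i] - b[j] for j in range(cols)]
             for i in range(rows)]
    sxy = sum(a[i] * b[j] * resid[i][j]
              for i in range(rows) for j in range(cols))
    sxx = sum(v * v for v in a) * sum(v * v for v in b)
    ss_interaction = sum(v * v for r in resid for v in r)
    # Regression sum of squares of the residuals on the products a_i * b_j.
    ss_nonadd = sxy * sxy / sxx if sxx > 0 else 0.0
    r_squared = ss_nonadd / ss_interaction if ss_interaction > 0 else 0.0
    sign = (sxy > 0) - (sxy < 0)
    return ss_nonadd, ss_interaction, r_squared, sign
```

For a purely multiplicative table such as `[[1, 2, 4], [2, 4, 8], [3, 6, 12]]` the whole interaction is a linear function of the product of main effects, so the ratio is 1 and the sign is positive, the case in which the abstract indicates p < 1.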


475  WALTER R. HARVEY (Biometrical Services, Agricultural
Research Service, United States Department of Agriculture,
Beltsville, Maryland). Estimation of Genetic Progress in Each
of n Traits.


Heritability (h^2) is the ratio of the additively genetic or “genic”
variance (σ_G^2) to the phenotypic variance (σ_P^2). It can easily be shown
to be the regression coefficient of breeding value on performance. Hence,
the equation for estimating the expected genetic progress, in the simplest
case of selection being applied for only one trait, is


    ΔG = h^2 s


where ΔG is the expected progress and s is the selection differential,
both in standard measure.
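
As a small worked illustration of the single-trait case, the following sketch computes heritability as a variance ratio and the expected progress as heritability times the selection differential. The function names and the numbers in the usage note are hypothetical and are not taken from the abstract.

```python
def heritability(genic_variance, phenotypic_variance):
    """Ratio of additive genetic ("genic") variance to phenotypic variance."""
    if phenotypic_variance <= 0:
        raise ValueError("phenotypic variance must be positive")
    return genic_variance / phenotypic_variance


def expected_progress(h2, selection_differential):
    """Expected genetic progress for selection on one trait.

    Both the result and the selection differential are in standard measure.
    """
    return h2 * selection_differential
```

For example, a genic variance of 2 against a phenotypic variance of 8 gives a heritability of 0.25, and a selection differential of 1.2 standard units would then give an expected progress of 0.3 standard units.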

The expected genic progress in P_2 (a second trait) when selection
has been entirely on P_1 is simply


    ΔG_2 = r_G1G2 h_1 h_2 s_1


in standard measure, where r_G1G2 is the genetic correlation between the
two traits.

When selection has been applied to each of n traits simultaneously 
the expected genetic progress for each trait is given by the following 
set of equations: 


= 
AG, = + + + + 
AG, + + + + 


AG, + + + + 


If the intensity of selection has been equal for each of the n traits the 
b’s for the above equations are obtained from the n solutions to the 
following sets of equations: 


pape bis Dos Das 


ST papi paps L Din Don 


Hence, it can readily be seen that failure to consider both the phenotypic 
and genetic correlations among traits may often lead to seriously biased 
estimates of expected genetic gain from selection. 


476  F. M. HEMPHILL (Department of Public Health Statistics,
University of Michigan, Ann Arbor, Michigan). Retrolental
Fibroplasia—A Clinical Trial. 


The Cooperative Study of Retrolental Fibroplasia and the Use of 
Oxygen was a simple and straightforward example of a clinical trial 
between two protocols of therapy. Premature infants, weighing less 
than 1,500 grams at birth, were assigned to routine and curtailed oxygen 
therapies in a ratio of 1:2. This randomization was made within 
hospitals and within each of three birth-weight groupings. Suspected 
ethical consequences were paramount in selecting the design of study. 
Premature infants from eighteen nurseries surviving forty-eight hours 
were admitted to this study. Prior research was beneficial in planning. 
The hypothesis of no difference between the protocols with regard to 
mortality was accepted but a similar hypothesis for retrolental
fibroplasia was dramatically and conclusively refuted. In light of these
findings, action by clinicians to restrict the use of oxygen appears to
have resulted in the practical elimination of the greatest cause of 
blindness in premature infants over the last decade. 


477  JOSEPH G. HOFFMAN (Roswell Park Memorial Institute,
Buffalo, New York). The Role of Variable Generation Time in
Tissue Cell Multiplication.


Variable generation time may be an expression of phenotypic 
variation as a result of mutation processes and it may also arise from 
time sequences of molecular events in individual cells. In certain 
instances a spread in generation time has been induced by ionizing 
radiations which act as perturbing agents. Whatever its origin, a 
variable generation time has profound effects on the behavior of tissue 
cell populations and the methods of measuring their growth in terms 
of the time parameters of the single cell’s life cycle. Analytic methods 
for determining the mitotic ratio in a free birth (no death) process, 
such as may exist in transplantable mouse tumors, will be described 
and compared with digital computer runs. The computer and the 
Monte Carlo Method provide an effective means for analysing growth 
measurements on a tumor. Results to date show that: (1) the mitotic 
ratio is not always a simple function of the time parameters of the cells’ 
life; (2) the mitotic ratio stabilizes slowly; (3) the distribution of popula- 
tion sizes is sensitive to the distribution of generation times. The 
implications of these results for experimental tumors, and for normal 
tissue cell sequences of differentiation will be discussed. 


478  H. F. HUDDLESTON (Agricultural Estimates Division, AMS,
United States Department of Agriculture, Washington, D. C.).
The Use of Plant Characteristics in Objective Yield Forecasting.


Studies on the use of objective plant data for forecasting yields and 
estimating yields at harvest time have been in progress for three years. 
Most of the effort has been devoted to cotton, corn, and wheat but 
similar work has been started on soybeans. 

Results indicate that the relationship of crop yields to observed 
plant characteristics follows similar patterns for most crops. Early 
in the season fruit already present can be counted on sample plots. 
Additional fruit still to be formed can be forecast from the amount 
already present and the stage of maturity attained by the average plant 
on that date. Ultimate size of mature fruit appears to be related to 
fruiting rate. This information, together with an allowance for normal 
fruit mortality and harvesting losses, provides a basis for forecasting 
yields. As the crop approaches maturity, yield forecasts and estimates 
of final yield are derived from observations on mature fruit with an 
allowance for harvesting loss. 


479  OSCAR KEMPTHORNE AND C. M. VON KROSIGK (Iowa
State College, Ames, Iowa). The Estimation of Environmental
and Genetic Trends from Records Subject to Culling.


It has been pointed out by Henderson and Lush that the standard 
procedure of least squares with an additive model leads to biassed 
estimates when the presence of observations in some cells is conditioned 
by the magnitude of observations in other cells. A common example is 
that of milk records on cows in different years, for which one might be 
tempted, in the absence of the previously given fact, to apply the linear 
model 

    y_ij = μ + c_i + a_j + e_ij

where

y_ij is the record of the i-th cow in the j-th year, presumed to be
adjusted for age by some external formula,

c_i is the effect of the i-th cow,

a_j is the effect of the j-th year,

and the e_ij's are the errors, assumed at least to be uncorrelated with
mean zero and variance σ^2.

The origin of the bias in routine least squares procedure for a two-way
classification with unequal numbers is discussed from another point of
view. This leads to a procedure for unbiassed estimation of year effects
under a particular model. The model assumes the c_i to be independent
normal variates with mean zero and variance σ_c^2, and the e_ij would be,
without culling, independent normal variates with mean zero and
variance σ^2 and independent of the c's. The error of the estimate is
obtained. The procedure for the simple case suggests a straightforward 
procedure for the general situation. The general model contains param- 
eters for year effects and an effect for each group of individuals entering 
the herd. The fitting of the general model by maximum likelihood is 
examined briefly. Some aspects of the model are illustrated and exam- 
ined with a set of actual data. 
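
The source of the bias can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the abstract: two years, normal cow effects and errors, and culling of every cow whose first-year record falls below the overall mean. With cow effects fitted, the least squares year contrast reduces to the mean within-cow difference among cows that have both records, which is what the hypothetical function below computes.

```python
import random


def simulate_year_contrast(n_cows=20000, year_effect=0.5,
                           sigma_c=1.0, sigma_e=1.0, seed=0):
    """Return (contrast without culling, contrast with culling).

    Each contrast is the mean of (year-2 record minus year-1 record).
    """
    rng = random.Random(seed)
    all_diffs, kept_diffs = [], []
    for _ in range(n_cows):
        cow = rng.gauss(0.0, sigma_c)
        y1 = cow + rng.gauss(0.0, sigma_e)
        y2 = year_effect + cow + rng.gauss(0.0, sigma_e)
        all_diffs.append(y2 - y1)
        # Assumed culling rule: cows with a low first record leave the
        # herd, so no second-year record exists for them.
        if y1 > 0.0:
            kept_diffs.append(y2 - y1)
    unculled = sum(all_diffs) / len(all_diffs)
    culled = sum(kept_diffs) / len(kept_diffs)
    return unculled, culled
```

Without culling the contrast is close to the true year effect. With culling, the retained cows have first-year errors that are positive on average, so the contrast is pulled well below the true value even though the model is additive.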


G. M. KUZNETS (University of California, Berkeley, California). 
Objective Forecasts of Fruit Production. 


Extensive experimentation has been carried on in California during 
the last five years with forecasts of fruit production of the form Z = XY, 
where X is a random variable which measures the change in set (number 
of fruit which will be available for harvest) from base to forecast year 
and is itself a ratio of two random variables and where Y measures the 
change in size of fruit and is some function also of a ratio of two random 
variables. Change in set has been estimated using counts of fruit on 
whole trees, on sample branches, on identical tagged branches, and 
within square frames on identical trees in two years. Estimates of 
change in size of fruit have been based on measurements of various 
diameters of individual fruit and on average weight of fruit. Various 
attempts have been made to estimate the relative contributions to 
total error of sampling and forecast errors for this class of forecasts. 


Some results will also be reported on the problem of optimum combina- 
tion of forecasts. 


481  H. D. LANDAHL (University of Chicago, Chicago, Illinois).
Some Theoretical Considerations of Potentiation in Drug
Interactions.


In order to account for some of the more important aspects of drug 
interaction it is necessary to consider a model which can also account 
for certain general properties of the action of a single drug. Hence a 
simple model, in which there may be enzymatic detoxification of a drug, 
is first studied theoretically. It is found that the relation between the 
time for appearance of an effect due to the drug and the size of the dose 
contains the same parameters as the relation between the effectiveness 
of paired doses and the interval of time between doses. A similar 
situation holds when the drug is given at a constant rate. 

When two drugs are given together, their effect will depend on how 
they interact, how much of each drug is given, which is given first and 
on the interval of time between injections. A number of plausible types 
of interaction are considered theoretically in terms of the model, analyt- 
ical expressions being given for a number of cases. The potentiation 
may be positive or negative. In the former case the potentiation may 
be more than or less than additive depending on the order of delivery 
and the time between injections. Methods for the determination of 
the parameters are discussed in relation to some available data. 


482  HAROLD NISSELSON AND THEODORE D. WOOLSEY (Statistical
Research Division, Bureau of the Census, Washington, D. C.; and
National Health Survey Program, Division of Public Health Methods,
Public Health Service, Washington, D. C.). Some Problems of the
Household Interview Design for the National Health Survey.


This paper deals with certain problems inherent in the measurement 
of morbidity by means of interview surveys. These are problems 
regarding which decisions had to be made in planning the household 
survey for the National Health Survey Program. The decisions were 
based as far as possible upon evidence from earlier surveys and data 
from a pretest conducted in the standard metropolitan area of Charlotte, 
North Carolina. 

The paper presents data from earlier surveys and from the Charlotte 
Pretest and shows how the decisions came to be made on: (1) rules for an 
acceptable respondent in the household; (2) the treatment of error and 
bias associated with the interviewer; (3) the use of certain types of 
“probes” in the interview; (4) the problem of recall of illness events in a 
specified period of time; and other questions. 

In all of the planning the principle followed was to seek more objec- 
tive operational definitions, even sometimes at the cost of some arbi- 
trariness, and also to go to extra expense, when necessary, in the 
organization of the interview and field work in order to reduce response 
error and bias. 


483  ROBERT D. PARR AND LYLE D. CALVIN (Oregon Crop Reporting
Service and Experiment Station, Oregon State College, Corvallis,
Oregon). Forecasting Filbert Production from Counts of Nuts Set
and Weights of Nuts.


In an attempt to improve the mid-season forecast of filbert produc- 
tion, objective measure techniques were started in 1955 on an experi- 
mental basis. Following a survey of the entire producing area in the 
Pacific Northwest, a probability sample of trees from 300 orchards 
was selected. Counts of nuts, defects, and blanks and nut weights 
were taken from the sample trees in July of each year to provide data 
for a forecast of the fall harvest. Different types of forecasts and their 
errors are presented for the 1955-1956 seasons. Several of these fore- 
casts met the objective of a standard error less than five per cent of 
actual production. Results to date indicate that forecasts of size and 
grade may also be practical since the nuts reach their maximum size 
by late July with no further appreciable change in green weight. Non- 
sampling errors are discussed. 


484  D. S. ROBSON (Cornell University, Ithaca, New York).
Applications of Multivariate Polykays to Genetic Covariance
Component Analysis.


Experimental variances and covariances obtained in quantitative 
inheritance studies are generally not statistically independent. For 
example, the F, variance and the F, , F; covariance derived from a 
selfing series are dependent statistics. Efficient use of these statistics 
in constructing estimates of genetic variance components and the valid 
estimation of error require that the variances and covariances of the 
experimental variances and covariances be known and estimable. The 
dispersion matrix of the experimental variances and covariances, 


    s_ij = [1/(m - 1)] Σ_α (x_iα - x̄_i)(x_jα - x̄_j),


is easily constructed from the general formula

    Cov(s_ij, s_kl) = (1/m) Cov{(x_i - Ex_i)(x_j - Ex_j), (x_k - Ex_k)(x_l - Ex_l)}
                      + [1/(m(m - 1))] {Cov(x_i, x_k) Cov(x_j, x_l) + Cov(x_i, x_l) Cov(x_j, x_k)}.


The estimation of Cov(s_ij, s_kl) is facilitated by use of the multivariate
extension of Tukey’s polykays and symmetric means. Thus, in terms
of multivariate polykays,


    Cov(s_ij, s_kl) = (1/m) [(1111)] + [1/(m - 1)] {[(1010)(0101)] + [(1001)(0110)]}

and is estimated without bias by the same function of the sample 
multivariate polykays. 
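
A covariance formula of this kind can be verified exactly on a small discrete population. The sketch below assumes the standard form for m independent observations, Cov(s_ij, s_kl) = (1/m) Cov{(x_i - Ex_i)(x_j - Ex_j), (x_k - Ex_k)(x_l - Ex_l)} + [1/(m(m - 1))] {Cov(x_i, x_k) Cov(x_j, x_l) + Cov(x_i, x_l) Cov(x_j, x_k)}, and compares it with the value obtained by enumerating every possible sample. It does not compute polykays; the function names and the test population are hypothetical.

```python
from fractions import Fraction
from itertools import product


def sample_cov(sample, i, j):
    """Unbiased sample covariance s_ij (integer-valued observations)."""
    m = len(sample)
    mean_i = Fraction(sum(x[i] for x in sample), m)
    mean_j = Fraction(sum(x[j] for x in sample), m)
    return sum((x[i] - mean_i) * (x[j] - mean_j) for x in sample) / (m - 1)


def exact_cov(points, probs, m, i, j, k, l):
    """Cov(s_ij, s_kl) by enumerating all ordered samples of size m."""
    e_a = e_b = e_ab = Fraction(0)
    for idx in product(range(len(points)), repeat=m):
        p = Fraction(1)
        for t in idx:
            p *= probs[t]
        sample = [points[t] for t in idx]
        a = sample_cov(sample, i, j)
        b = sample_cov(sample, k, l)
        e_a += p * a
        e_b += p * b
        e_ab += p * a * b
    return e_ab - e_a * e_b


def formula_cov(points, probs, m, i, j, k, l):
    """Cov(s_ij, s_kl) from population moments, using the assumed formula."""
    def expect(f):
        return sum(p * f(x) for x, p in zip(points, probs))

    mu = [expect(lambda x, d=d: x[d]) for d in range(len(points[0]))]

    def cov(a, b):
        return expect(lambda x: (x[a] - mu[a]) * (x[b] - mu[b]))

    cov_of_products = expect(
        lambda x: (x[i] - mu[i]) * (x[j] - mu[j])
        * (x[k] - mu[k]) * (x[l] - mu[l])
    ) - cov(i, j) * cov(k, l)
    return (cov_of_products / m
            + (cov(i, k) * cov(j, l) + cov(i, l) * cov(j, k)) / (m * (m - 1)))
```

Exact rational arithmetic is used so that the two routes can be compared for equality rather than to within rounding error.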


485  EVAN J. WILLIAMS (North Carolina State College, Raleigh,
North Carolina). Prediction of Clinical Effects of Rh
Incompatibility.


In the prognosis of haemolytic disease of the newborn, the maternal 
Rh titre has been found to be an important but not infallible guide. 
While in general a high maternal titre indicates that the infant will be 
severely affected, this is not always so, even when the foetus is Rh 
positive. In this study an attempt has been made to determine criteria 
based on Rh titres measured by different techniques and at different 
stages of pregnancy, which shall enable severity to be predicted with 
minimum probability of error. 

Rh titres determined by two different methods were considered: 
the albumin titre and the indirect Coombs titre. The recorded figures 
were the numbers of twofold dilutions of the original suspension which 
were required to reduce the response to ‘weak’; that is, the logarithms 
(to base 2) of the actual Rh titres. It was found simplest to relate 
severity of effect to these logarithmic values. 

For a random sample of Rh negative mothers attending the Women’s 
Hospital, Melbourne, albumin titres and indirect Coombs titres were 
determined in the early (10 to 20 weeks) and late (37 to 38 weeks) 
stages of pregnancy. The respective logarithmic values are denoted 
A_1, C_1, A_2, and C_2. The infants were classified at birth into six
classes:


Rh negative
Rh positive: Unaffected
             Mildly affected
             Moderately affected
             Severely affected
             Stillborn.


Discriminant functions based on the Rh titres were determined to
distinguish between the different groups. It was found that a linear
combination of C_1 and C_2 was a satisfactory predictor and that the
albumin titres were less effective predictors, whose inclusion did not
significantly improve the prediction. The combination of C_1 and C_2
may be regarded as a measure of the initial level of maternal Rh
antibody, and of the change in antibody level during pregnancy, each of
which can be given an immunological interpretation.
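
A linear discriminant in two titre variables can be sketched as follows. This is a simplified two-group version of Fisher's linear discriminant with a pooled covariance matrix; the abstract concerns six classes and does not state the computation used, so the functions, the cutoff rule, and the toy values in the usage note are all hypothetical.

```python
def mean_vector(rows):
    """Mean of each of the two coordinates (early and late log titre)."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(2)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group 2 x 2 covariance matrix."""
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for p in range(2):
                for q in range(2):
                    pooled[p][q] += d[p] * d[q]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in row] for row in pooled]


def linear_discriminant(group_a, group_b):
    """Return (weights, cutoff); scores above the cutoff point to group_b."""
    (s11, s12), (s21, s22) = pooled_covariance(group_a, group_b)
    det = s11 * s22 - s12 * s21
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    d0, d1 = mb[0] - ma[0], mb[1] - ma[1]
    # Weights are the inverse pooled covariance times the mean difference.
    w = [(s22 * d0 - s12 * d1) / det, (-s21 * d0 + s11 * d1) / det]
    cutoff = sum(w[k] * (ma[k] + mb[k]) / 2 for k in range(2))
    return w, cutoff


def score(w, row):
    """Discriminant score of one (early, late) pair of log titres."""
    return w[0] * row[0] + w[1] * row[1]
```

With made-up log titres such as `[(1, 1), (2, 2), (1, 2), (2, 1)]` for one group and `[(3, 5), (4, 6), (3, 6), (4, 5)]` for another, the fitted weights separate the two groups at the midpoint cutoff. Rewriting the score in terms of the early value and the late-minus-early change gives the level-and-change reading mentioned in the abstract.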

The results presented here are tentative, being based on a sample 
of only 50, but confirmatory work is in progress. 


Sampling problems 


In order to determine the probabilities of different kinds of errors 
accurately, it is necessary to know the proportion of each severity 
class in the population of births. Some consideration is given to sequen- 
tial sampling methods which shall ensure that the class means and the 
proportions in each class are estimated with sufficient accuracy. 


486  JOHN H. WILLIAMS, JR. (Veterans Administration, Washington,
D. C.). The Joint Action of Drugs in Tuberculosis.


This paper briefly discusses the development of multiple drug 
therapy in tuberculosis with emphasis on the problems of drug
resistance, the place of secondary or comparatively non-effective drugs, and
on the importance of frequency of administration. The problem of
determining optimal regimens for different types of patients and the
problem of evaluating the results of drug trials so as to be of greatest 
use to the clinician are also discussed. 


Abstracts of the following papers will be published in the Journal of the 
American Statistical Association: 


J. STUART HUNTER (American Cyanamid Company, New York, 
New York). Some Considerations Arising in the Exploration and 
Exploitation of Response Surfaces. 


ROBERT M. DEBAUN (American Cyanamid Company, Stamford, 
Connecticut). Use of Response Surface Techniques. 


R. L. ANDERSON (North Carolina State College, Raleigh, North 
Carolina). Fractional Factorials and Confounding. 


J. E. WALSH (Lockheed Aircraft Corporation, Burbank, California). 
Approximate Distribution-Free Properties of ¢ + ks Tolerance 
Interval. 


S. C. SAUNDERS (Boeing Airplane Company, Seattle, Washington). 
Sequential Distribution-Free Tolerance Limits. 


E. A. THOMAS (Avco Research and Advanced Development, Win- 
chester, Massachusetts). Estimation of Some Tolerance Limits 
from Censored Data. 


ALFRED LIEBERMAN (Department of the Navy, Washington, 
D. C.). Determination of Tolerance Limits Through Experimen- 
tation. 


HAROLD GULLIKSEN (Educational Testing Service and Princeton 
University, Princeton, New Jersey). Problems of Metric. 


FREDERICK M. LORD (Educational Testing Service, Princeton, 
New Jersey). Problems Arising from Errors of Measurement. 


JOHN D. HROMI (United States Steel Corporation, Monroeville, 
Pennsylvania). Fractional Factorial Experiments. 


MAVIS CARROLL (General Foods Corporation, Tarrytown, New
York). Application of Fractional Replications in the Food Industry.


CUTHBERT DANIEL (Private Consultant, New York, New York).
Industrial Experience with Fractional Replications.


GEORGE V. MANN (Wellesley Hills, Massachusetts). The Evidence
Relating Diet to Coronary Heart Disease.


MORTON ROBINS (Veterans Administration, Washington, D. C.).
The Problem of Cause and Effect in Heart Disease.


EDWARD A. LEW (Metropolitan Life Insurance Company, New York,
New York). Variations in Mortality from Heart Disease.


MORTON ROBINS AND ALFRED M. STEINMAN (Veterans
Administration, Washington, D. C.). Ecology of Coronary Heart
Disease.


EUGENE LEVINE AND BENJAMIN BUCHBINDER (United 
States Public Health Service, Washington, D. C.). Evaluating 
Influences on Satisfaction with Patient Care in Hospitals Through 
Multiple Regression Techniques. 


C. H. COOMBS (University of Michigan, Ann Arbor, Michigan). 
Inconsistency of Preferential Choice, as a Measure of
Psychological Utility.


H. O. HARTLEY (Iowa State College, Ames, Iowa). Maximum 
Likelihood Estimation from Incomplete Data. 


REGIONAL
British Region
At a meeting held on 21st November, 1957, the following papers
were presented:
O. M. Lidwell—Assessing cross-infection in upper respiratory 
disease 
C. C. Spicer—Poliomyelitis and the weather. 


French Region


At a meeting held on 20 November, 1957, Mlle. J. Ulmo presented
a paper entitled “Quelques problèmes de jugement sur
échantillons relatifs à la régression.”


CHANGES IN MEMBERSHIP 
(August 1957-February 1958) 


Changes of Address 


Mr. Ross W. Adams, 500 East 33rd Street, Apartment 511, Chicago 16, 
Illinois, U.S.A. 

Dr. F. S. G. A. Alberoni, Istituto di Medicina Legale e delle
Assicurazioni dell’Universita, Pavia, Italy

Prof. Dr. N. Atanasiu, Laberstr. 11, b. Hausler, Giessen, Germany 

Dr. Donald W. Bailey, Department of Zoology, University of Kansas, 
Lawrence, Kansas, U.S.A. 

Mr. Rainald K. Bauer, Jungfernweg 34, Deutsche Edelstahlwerke, 
Krefeld, Germany 

Dr. Carl Allen Bennett, 3115 West Canal Drive, Kennewick, Wash- 
ington, U.S.A. 

Mr. Colin R. Blyth, Statistics Laboratory, Stanford, California, U.S.A. 

Mr. P. R. Booth, Draft Foods Ltd., Kirkby Trading Estate, Kirkby, 
Lancashire, England 

Mr. Attilio Bosticco, Istituto di Zootecnia Generale, Borgo Carissimi 10, 
Parma, Italy 

Prof. Francesco Brambilla, Ist. di Statistica dell’ Universita, Via Bertani 
1, Genova, Italy 

Dr. Luigi Cavalli-Sforza, Via Fatebenesorelle 18, Milano, Italy 

Dr. Ruggero P. N. Cepellini, Viale Bianco di Savoia 26, Milano, Italy 

Prof. W. G. Cochran, Department of Statistics, Harvard University, 
Cambridge, Massachusetts, U.S.A. 

Maj. Hilton H. Earle, Jr., P. O. Box 543, Edgewood, Maryland, U.S.A. 

Prof. Benjamin Epstein, Dept. of Statistics, Stanford University, 
Stanford, California, U.S.A. 

Mr. Morris D. Finkner, New Mexico Agr. Expt. Station, Biometrics, 
Box 128, State College, New Mexico, U.S.A. 

Dr. N. R. Fraser, Ley Road, St. James, Cape Town, South Africa 

Dr. Vittorio Gallo, Via Vittorio Veneto 6 bis, Mortara, Italy 

Mr. John J. Gart, Oak Ridge National Laboratory, P. O. Box Y, Oak 
Ridge, Tennessee, U.S.A. 

Dr. Hans Gebelein, Wetzelstr. la, Bamberg, Germany 

Dr. Franz J. Geks, Farbenfabriken Bayer, Leverkusen-Bayerwerk,
Germany

Dr. W. D. Hanson, Crop Research Division, Plant Industry Station, 
Beltsville, Maryland, U.S.A. 

Mr. Jacques Hardouin, Kerekere-D. sp. Bunia, Ituri (Prov. Or) Belgian 
Congo 

Dr. Eugene R. Harris, 3451 Ault View, Cincinnati 8, Ohio, U.S.A. 

Prof. Stanley R. Hill, 2727 Prewett Street, Los Angeles 31, California, 
U.S.A. 

Dr. Ted W. Horner, General Mills, Inc., 400 Second Avenue S., Minne- 
apolis, Minnesota, U.S.A. 

Dr. A. R. Khalil, 1-A-Mausa Ben Mimoon Street, Abbesieh, Cairo, 
Egypt 

Prof. Dr. Siegfried Koller, Ruckerstr. 7, Stat. Bundesamt, Wiesbaden, 
Germany 

Mr. Jacob L. Kovner, 633 Stover Street, Ft. Collins, Colorado, U.S.A. 

Dipl. Math Helmut Kregeloh, Mannesmann-Huttenwerke AG, BWSt.,
Duisburg-Huckingen, Germany

Dipl. Math R. Lang, Faberstr. 12, (14a) Stuttgart-S, Germany 

Dr. Dietrich Lorenz, Ostmerheimerstr. 198, Koln-Bruck, Germany 

Dr. Eschscholtzia L. Lucia, 125 Camben Drive, Apartment 12-E,
Parkmerced, San Francisco, California, U.S.A.

Miss Ethelyne L. McBee, 1113 S.W. First Avenue, Gainesville, Florida, 
U.S.A. 

Mr. Paul Meier, Eckhart Hall, University of Chicago, Chicago 37, 
Illinois, U.S.A. 

Mr. John Lester Michael, Department of Psychology, University of 
Houston, Cullen Boulevard, Houston 4, Texas, U.S.A. 

Dr. Mario Orsenigo, Via F dall’Ongaro, Milano, Italy 

Dr. Bernard L. Oser, Food and Drug Research Laboratories, Inc., 

Maurice Avenue at 58th Street, Maspeth 78, New York, U.S.A. 

Dr. Bernard Ostle, 3120 Cardenas Drive, NE, Albuquerque, New 
Mexico, U.S.A. 

Dr. Per Ottestad, Norges Landbrukshogskole, Matemetisk Institutt, 
Vollebekk, Norway 

Dr. Nellie M. Payne, Velsicol Chemical Corporation, 330 East Grand 
Avenue, Chicago 11, Illinois, U.S.A. 

Prof. Dr. Richard Prigge, Sud 10, Paul-Ehrlichstr. 8, Frankfurt s. 
Main, Germany 

Dr. Anita E. Rapoport, 308 N. East Avenue, Oak Park, Illinois, U.S.A. 

Dr. Shiela Rowley, 8 Bay Street, Greenwich, New South Wales, Australia 

Dr. Giovanni F. Rubino, Via Nota 7, Torino, Italy 

Mr. A. A. Rutherford, Dept. of Statistics, University of Aberdeen, 
Meston Walk, Old Aberdeen, Scotland 

Mr. Gian Tommaso Scarascia, Istituto Scientifico Sperimentale per i
Tabacchi, Via Nazionale 66, Rome, Italy

Dr. Giorgio Schreiber, Faculdade de Filosofia, Edificio Acaiaca, Av.
Afonso Pena, Belo Horizonte-M.G., Brazil, South America

Mr. Howard A. Schuck, 930 N. Logan Avenue, Colorado Springs, 
Colorado, U.S.A. 

Dr. Francesco Sella, Istituto di Igiene dell’Universita, Via Caterina 4, 
Siena, Italy 

Dr. Robert L. Stearman, 116 Rolling Road, Gaithersburg, Maryland,
U.S.A.

Dr. Klaus J. Stern, Siekar Landstr. 16, Schmalenbeck/Ahrenzburg, 
(Holst.) Germany 

Dr. Albert L. Tester, Chief, Division of Biological Research, Bureau of 
Commercial Fisheries, Washington 25, D. C., U.S.A. 

Prof. John W. Tukey, 630 Circle Drive, Palo Alto, California, U.S.A. 

Mr. G. A. Watterson, Australian National University, Box 4, G.P.O., 
Canberra A.C.T., Australia 

Mr. J. B. Wilmeth, P. O. Box 7413, Benjamin Franklin Station, Wash- 
ington 4, D. C., U.S.A. 

Dr. Manfred Woelke, Katzestreet 80, Sunnyside, Pretoria, Sudafrika, 
Umgibeli 51, South Africa 


New Members 
At Large 


Dr. P. C. Butler, Division of Tropical Research, United Fruit Company, 
La Limas, Honduras, Central America 
S. Karatas, Esq., Ziraat Fakultesi, Zootakni Kursusu, Ankara, Turkey 


Australasia 


Dr. A. Brown, Dept. of Mathematics, University of Melbourne, Carleton 
N. 3, Victoria, Australia 


Belgian 


Mr. J. M. Bienfait, 71 Rue Timmermans, Bruxelles, Belgium 
Dr. Charles Lieber, 69 rue de Hennin, Brussels, Belgium
Mr. Julien F. M. Ronchaine, 70 Grand Rue, Gembloux, Belgium


British 


Mr. T. B. Bagenal, Marine Station, Millport, Isle of Cumbrae, Scotland 

Mr. R. G. Carpenter, Dept. of Human Ecology, Fenner’s, Cambridge, 
England 

Mr. G. M. Clarke, University of Bristol, Research Station, Long Ashton, 
Bristol, England 

Dr. Z. P. Dienes, 245 Queens Road, Leicester, England 

Dr. J. H. Edwards, Dept. of Social Medicine, The Medical School, 
Edgbaston, Birmingham 15, England 

Mr. N. Ferguson, 28 London Road, Sittingbourne, Kent, England 

Mr. D. Hewitt, Dept. of Social Medicine, 8 Keble Road, Oxford, England 


ENAR 


Mr. Gordon L. Baskerville, Box 428, Federal Forestry Branch, Fred- 
ericton, N.B., Canada 

Mr. K. P. Bovard, Dept. of Animal Husbandry, Iowa State College, 
Ames, Iowa, U.S.A. 

Dr. E. I. Burdeck, 722 West 168 Street, New York 32, N. Y., U.S.A. 

Mr. Wilford L. Davis, N. C. State College, Institute of Statistics, Box 
5457, Raleigh, North Carolina, U.S.A. 

Mr. Frank Freese, Room 704, Lowich Building, 2026 St. Charles Street, 
New Orleans 13, Louisiana, U.S.A. 

Mr. Nicholas E. Manos, 3306 Jones Bridge Road, Washington 15, D. C., 
U.S.A. 

Dr. C. A. McMahan, Louisiana State University Medical School, 1542 
Tulane Avenue, New Orleans 12, Louisiana, U.S.A. 

Dr. Stanley Wearden, Statistical Laboratory, Kansas State College, 
Manhattan, Kansas, U.S.A. 


Mr. C. W. Yeatman, Petawawa Forest Experiment Station, Chalk 
River, Ontario, Canada 


German 


Mr. W. Hartmann, Kreis Neustadt/Rbge., Max-Planck-Institut, 
Mariansee, Germany 


Indian 


Mr. N. K. Chakravarti, Senior Scientific Officer, T.D.E. Laboratories, 
Post Box No. 320, Kanpur, India 

Mr. R. L. Khanna, M.A., Assistant Statistician, Dept. of Agriculture, 
Himachal Pradesh Administration, Hawthorn Villa, Simla-4, India 

Dr. B. K. Mukerji, Director, Indian Institute of Sugarcane Research, 
Rae Bareli Road, P. O. Dilkusha, Lucknow, India 

Sri J. K. Pande, Director, Economics and Statistics Department, 
Sarojini Naidu Marg, Lucknow, India 

Mr. Rameshwar Saran, B. Sec. C. E. (Hons.), M. I. E., Director, Irriga- 
tion Research Institute, Roorkee, U.P., India 

Mr. B. D. Tikkiwal, Department of Statistics, Karnatak University, 
Dharwar, India 


Italian 


Dr. A. Gerra, Via Vicenzo Monti 7, Milano, Italy 
Dr. A. Grimaldi, Istituto di Agronomia, Facolta di Agraria, Perugia, 
Italy 


Dr. F. Nicolis, Lepetit S.p.a., Via Lepetit 10, Milano, Italy 


Japanese 


Prof. Shinya Iyama, Plant Breeding Laboratory, Faculty of Agriculture, 
Tokyo University, Bunkyo-ku, Tokyo, Japan 

Miss Kimiko Motomura, Kyushu University, Biological Institute, 
Faculty of Science, Hakozaki, Fukuoka-shi, Japan 

Miss Michie Shirafuji, Mathematical Institute, Faculty of Science, 
Hakozaki, Fukuoka-shi, Fukuoka, Japan 

Mr. Ryuzo Yamada, Takeda Pharmaceutical Indust. Ltd., 54 Nishi-e-4- 
chome, Higashi-Yodogawa-ku, Osaka, Japan 


Netherlands 
Mr. Marten Kuilman, N. V. Philips, Drachten (Fr.), Netherlands 
Ir. D. H. M. van Stogteren, von Bonninghausenlaan 39, Lisse,
Netherlands

WNAR 

Mr. Jacob L. Kovner, 633 Stover Street, Ft. Collins, Colorado, U.S.A. 


Deaths 


Mr. A. D. Grace, Manor Works, Ettingshall, Wolverhampton, England 
Professor E. Ullrich, Giessen, Johannesstr. 1, Germany 


NEWS AND ANNOUNCEMENTS 


Members are invited to transmit to their National or Regional Secretary 
(if members at large, to the General Secretary) news of appointments, 
distinctions, or retirements and announcements of professional interest. 


Herbert A. David has joined the staff of the Virginia Polytechnic 
Institute as Professor of Statistics where he will be engaged in research 
and teaching in the graduate program of the Department of Statistics. 


Rolf Bargmann has taken a position as Associate Professor of Sta- 
tistics at the Virginia Polytechnic Institute. He will be engaged in 
the new cooperative program in biostatistics announced in the December 
1957 issue of Biometrics. 


REQUEST FOR CONTRIBUTED PAPERS 


Contributed papers for the Biometric Society Meetings in Chicago, 
December, 1958, will be received by 
Virgil L. Anderson 
Statistical and Computing Laboratory 
Purdue University 
Lafayette, Indiana. 
The program for contributed papers will be made up as soon as possible. 


COMMITTEE ON MATHEMATICAL TABLES 


Since its organization early in 1956, the Institute of Mathematical 
Statistics’ Committee on Mathematical Tables has been concerned with 
the problems associated, either directly or indirectly, with the computa- 
tion of mathematical tables of interest to statisticians. The committee’s 
function is threefold: 


(i) To gather information relating to the tabulation of functions 
of interest to statisticians. 

(ii) To advise on the need for, and preparation of, statistical tables. 

(iii) To determine the availability of and coordinate the distribution 
of free time on high speed digital computers for the computation 
of statistical tables. 
In order to fulfill its function, the committee investigated the interests
and needs of the Institute membership concerning statistical tables and 
as a result set up nine subcommittees covering the areas of greatest 
interest. To date the activities of these subcommittees have been 
directed primarily towards the preparation of bibliographies in their 
individual fields. The subcommittees, along with their chairmen, are 
listed below. 

1. Chi-Square,
   W. Kruskal, U. of Chicago
2. t-Distribution (Univariate and Multivariate),
   C. W. Dunnett, American Cyanamid Co., Pearl River, New York
3. Studentized Range,
   A. H. Bowker, Stanford U.
4. F-Distribution (Incomplete-Beta, Binomial),
   E. E. Cureton, U. of Tennessee
5. Hypergeometric Distribution (Not the hypergeometric function),
   W. Kruskal, U. of Chicago
6. Polyvariate Normal Distribution, including latent roots,
   G. P. Steck, Sandia Corp., Albuquerque, New Mexico
7. Availability of Simple Techniques,
   Chairman not appointed
8. Annals Supplement (of statistical tables),
   J. W. Tukey, Bell Telephone Labs., Murray Hill, New Jersey
9. Computing Facilities and Cost-Free Machine Time,
   F. C. Leone, Case Inst. of Tech.

Additional information on any of the above activities may be obtained 
from the chairman of the Committee on Mathematical Tables, D. B. 
Owen, Sandia Corporation, Albuquerque, New Mexico, or from any 
of the subcommittee chairmen. Anyone having time on a digital 
computer which may be made available on a cost-free basis to persons 
desiring to compute tables of general interest is invited to contact the 
chairman of subcommittee 9. 


ROYAL STATISTICAL SOCIETY CONFERENCE 


RESEARCH AND INDUSTRIAL APPLICATIONS SECTIONS 


The Research Section and the Industrial Applications Section of 
the Royal Statistical Society intend to hold a Conference at the Uni- 
versity of St. Andrews, near Edinburgh, Scotland, from the 22 August 
to the 25 August inclusive. It will be devoted to Mathematical
Statistics, with special reference to those branches of the subject which
have application in industry. 

It is proposed that there should be three morning sessions, consisting 
each of two or three pre-arranged lectures, and three early evening 
sessions (5:30 p.m. to 6:30 p.m.) each with one pre-arranged lecture. 

The afternoons will be devoted to ‘Splinter Groups’ which will 
devote themselves to special aspects, and at which informal talks of 
some ten or fifteen minutes each can be given without prior arrangement. 

Topics to be covered in the morning and evening sessions include 
aspects of the analysis of variance, non-parametric inference, stochastic 
aspects of linear and dynamic programming, and foundations of prob- 
ability in statistics. 

It is hoped that many of the mathematical statisticians who will be 
coming from abroad to attend the Edinburgh International Congress 
of Mathematicians, will choose to remain in Scotland for a further few 
days, and take the opportunity of meeting colleagues specially interested 
in their field. St. Andrews, besides having the famous Golf Course, is a 
small Scottish town of considerable character, and a very good centre 
for the exploration of the Eastern Highlands. 

Accommodation (from 21 August to 26 August) will be provided 
within the hostels of the University of St. Andrews at a reasonably 
low cost, details to follow later. Anyone interested should write, 
marking the envelope ‘ST. ANDREWS CONFERENCE’, to Miss U. 
Croker, Royal Statistical Society, 21 Bentinck Street, London, W. 1, 
England. 


NINTH INTERNATIONAL BOTANICAL CONGRESS 


The Ninth International Botanical Congress will be held in Montreal, 
Canada, from August 19 to 29, 1959, at McGill University and the 
University of Montreal. The program will include papers and symposia 
related to all branches of pure and applied botany. A first circular 
giving information on program, accommodation, excursions, and other 
detail will be available early in 1958. This circular and subsequent 
circulars including application forms will be sent only to those who write 
to the Secretary-General asking to be placed on the Congress mailing list: 

Dr. C. Frankton 

Secretary-General 

IX International Botanical Congress 
Science Service Building 

Ottawa, Ontario 

Canada. 
SYMPOSIUM ON NUMERICAL APPROXIMATION


A Symposium on Numerical Approximation sponsored by the 
Mathematics Research Center, U. S. Army, will be held April 20 to 23 
at the University of Wisconsin, Madison. The topics of the Symposium 
include linear approximation, interpolation, Tchebycheff and other 
extremal approximations, expansions and algorithms. 

One-hour surveys (including a survey of recent Russian literature) 
and 30-minute research papers will be presented. There will be 
opportunity for formal and informal discussion. It is intended to have 
the proceedings of the Symposium published. 

Approximately twenty speakers will participate. Among these are 
the following guests from abroad: L. Collatz, L. Fox, Z. Kopal, C. P. 
Miller, A. Ostrowski, and E. L. Stiefel. 

Workers in the field interested in attending the Symposium are 
urged to write to 

Professor R. E. Langer, Director 
Mathematics Research Center, U. S. Army
University of Wisconsin 

1118 W. Johnson Street 

Madison 6, Wisconsin. 


SUMMER SESSIONS AT BERKELEY, CALIFORNIA 


The 1958 summer program in the Department of Statistics of the 
University of California, Berkeley, California, will consist of two sessions: 
June 16 to July 26 and July 28 to September 6. The faculty of the 
summer sessions will include Professor U. S. Nair of Travancore University
in India, Dr. F. N. David of University College in London, and
Professors David Blackwell, Evelyn Fix, Joseph L. Hodges, Jr., and 
J. Neyman of the Department of Statistics of the University of Cali- 
fornia, Berkeley. 

The program will include two undergraduate courses in each session, 
offered primarily to meet the needs of students transferring from other 
centers who would like to undertake advanced study at the University 
of California during the regular academic year. In addition there will 
be two research seminars. A seminar in statistical problems of health 
will continue through both sessions. The discussions on the health 
problem will begin with the medico-biological aspects which will then 
be followed by the statistical treatment. The other research seminar, 
to be given in the first session only, will be devoted to the statistical 
study of structural relations in the physical sciences. 
SOUTHERN REGIONAL GRADUATE SUMMER SESSION
IN STATISTICS 


Oklahoma State University, Stillwater, Oklahoma, is host to the 1958 
(June 16 to July 25) Southern Regional Graduate Summer Session in 
Statistics sponsored by North Carolina State College, University of 
Florida, Virginia Polytechnic Institute, and Oklahoma State University. 
These summer sessions are offered in rotation by the sponsoring institu- 
tions and provide for a basic graduate program in statistics for Statis- 
ticians, Research Workers, Teachers of Statistics, Biologists, Social 
Scientists, Physical Scientists, and Mathematicians. Visiting professors 
to the 1958 summer session are Dr. H. O. Hartley, Iowa State College, 
Dr. Walter Federer, Cornell, and Dr. Paul Minton, Southern Methodist 
University. 

A series of weekly symposia will constitute an enriching experience 
to all summer school participants. 

June 16-20       Response Surfaces          G. E. P. Box
                                            Princeton University
June 23-27       High Speed Computing       John Hamblen
                                            Oklahoma State University
June 30-July 3   Sampling Designs           H. O. Hartley
                                            Iowa State College
July 7-11        Nonparametric Statistics   Ralph A. Bradley
                                            Virginia Polytechnic Institute
July 14-18       Multiple Comparisons       David B. Duncan
                                            Univ. of North Carolina
July 21-24       Experimental Design        Walter Federer
                                            Cornell University

For further information write to Carl E. Marshall, Director of
Statistical Laboratory, Oklahoma State University, Stillwater, Oklahoma.
ARTICLES


“An Outlook Report” (Presidential Address) R. LEONARD
Tinbergen on Economic Policy (review article) KENNETH J. ARROW
Smoking and Lung Cancer: Some Observations on Two Recent Reports
JOSEPH BERKSON
A Statistical Model for Life-Length of Materials
Z. W. BIRNBAUM AND S. C. SAUNDERS
Probabilistic Interpretations for the Mean Square Contingency
H. M. BLALOCK, JR.
Procedure for Computing Regression Coefficients DUDLEY J. COWDEN
Some Aspects of the Use of the Sequential Probability Ratio Test
M. H. DEGROOT AND JACK NADLER
A Modified Doolittle Approach for Multiple and Partial Correlation and


An Experiment with Weighted Indexes of Cyclical Diffusion
BERT G. HICKMAN
On the Use of Randomization in the Investigation of Unknown Functions
ROBERT HOOKE
Linear Curve Fitting Using Least Deviations
Philippine Statistical Program Development and the Survey of Households
MILTON D. LIEBERMAN
On the Relative Accuracy of Some Sampling Techniques
Use of Varying Seasonal Weights in Price Index Construction
DORIS P. ROTHWELL
Fitting Straight Lines When One Variable is Controlled HENRY SCHEFFÉ
On Ranking Parameters of Scale in Type III Populations K. C. SEAL
Karl Pearson—An Appreciation on the 100th Anniversary
SAMUEL A. STOUFFER
The Contributions of Karl Pearson HELEN M. WALKER
On the Distribution of Solutions in Linear Programming Problems
HARVEY M. WAGNER
A Statistical Analysis of Provisional Estimates of Gross National Product and
Its Components, of Selected National Income Components, and
of Personal Saving ARNOLD ZELLNER


BOOK REVIEWS 
PUBLICATIONS RECEIVED 


VOLUME 53 NUMBER 281 


Information on memberships, subscriptions, and back numbers should be 
requested from the Executive Director, American Statistical Association, 
1757 K Street, N.W., Washington 6, D. C., U.S.A. 
INFORMATION FOR CONTRIBUTORS


MANUSCRIPTS 


Contributions for Biometrics may be addressed to Dr. Ralph A. Bradley, Department
of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia, U.S.A.;
authors residing in the following Society Regions can expedite consideration of papers
by submitting them to the appropriate Associate Editor, namely: BRITISH REGION:
Dr. S. C. Pearce, East Malling Research Station, East Malling, Maidstone,
Kent, England; AUSTRALIAN REGION: Dr. E. A. Cornish, University of
Adelaide, Adelaide, Australia; FRENCH REGION: Dr. Georges Teissier, Faculté
des Sciences de Paris, 1 rue V. Cousin, Paris, France. QUERIES, NOTES, and
related correspondence should be directed to Professor G. W. Snedecor, Statistical
Laboratory, Iowa State College, Ames, Iowa, U.S.A.

MANUSCRIPTS must be submitted in triplicate, with typescript doublespaced 
throughout. Marginal notes may obviate typographical difficulties presented by 
complicated formulae or tables—authors should not attempt editorial instructions 
or markings for the printer. TABLES should be identified by arabic number and 
by a short descriptive title. ILLUSTRATIONS should also be identified by arabic 
number and by a brief caption. (Captions should not be included in illustrations, 
but should be typewritten collectively on an accompanying sheet.) Originals 
should be approximately 8.5 x 11 in. (21.5 x 28 cm.). The original of each chart, 
diagram or graph should be executed in black on white drawing paper or board, on 
blue tracing linen, or on coordinate paper ruled in blue only; coordinate lines to be 
reproduced should be ruled in black. For printing, illustrations may be reduced to
a fraction of their original dimensions. Lines should therefore be of sufficient thickness, and
decimal points, periods, and stippled dots should be solid black circles large enough 
to reproduce well. Lettering and numerals should be at least 1 mm. high when 
reproduced in a cut 3 in. (7.5 cm.) wide. Photographs should be prints on glossy 
paper with strong contrasts, and if grouped in a plate should be mounted contig- 
uously. All tables and illustrations should be mentioned explicitly in the text. 


REFERENCES (BIBLIOGRAPHIC) should be collectively listed alphabetically 
by author; textual citation by author and year is preferred. 


ABSTRACTS 


Abstracts of papers presented at meetings of the Biometric Society or of its 
regions are printed in Biometrics following such meetings. They should be submitted 
to the person designated to receive them for a particular meeting in exactly the form 
published in Biometrics (except for an Abstract Number), doublespaced on bond 
paper and in duplicate. Use of formulae requiring display printing is to be avoided. 


NOTICES, ANNOUNCEMENTS AND BIOMETRIC SOCIETY REPORTS


International and regional reports and notices should be submitted by the 
appropriate officers of the Society and its Regions in duplicate doublespaced on
separate sheets exactly as they are to be printed in Biometrics. Other material to 
be printed in News and Announcements should also be submitted doublespaced 
and in duplicate. 


SUSTAINING MEMBERS OF THE BIOMETRIC SOCIETY


Abbott Laboratories
American Cancer Society, Inc.
Merck, Sharp and Dohme Research Laboratories
Schering Corporation
Smith, Kline and French Laboratories
E. R. Squibb and Sons
Wyeth Institute of Applied Biochemistry
BACK ISSUES


Back issues of Biometrics are available at the following postage-paid 
prices in U.S.A. currency: 


Volume    Number    Price per Single Number    Price per Volume (unbound)

          1 to 6
          1 to 6
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
          1 to 4
Inquiries, non-member subscriptions, and orders for back issues should
be addressed to: 


BIOMETRICS

DEPARTMENT OF STATISTICS
VIRGINIA POLYTECHNIC INSTITUTE
BLACKSBURG, VIRGINIA, U.S.A.


Reprints of individual articles are not available except to authors at the 
time of printing. Three special issues are among the numbers listed 